GENE EXPRESSION PROFILING OF PRIMARY BREAST
CARCINOMAS USING ARRAYS OF CANDIDATE GENES 

This invention relates to polynucleotide analysis 
and, in particular, to polynucleotide expression profiling of 
carcinomas using arrays of candidate polynucleotides.

Pathologists and clinicians in charge of the 
management of breast cancer patients are facing two major 
problems, namely the extensive heterogeneity of the disease 
and the lack of factors - among conventional histological and 
clinical features - predicting with reliability the evolution 
of the disease and its sensitivity to cancer therapies. 
Breast tumors of the same apparent prognostic type vary 
widely in their responsiveness to therapy and consequent 
survival of the patient. New prognostic and predictive 
factors are needed to allow an individualization of therapy 
for each patient. 

Great hope is currently being placed on molecular 
studies, which address the problem in a global fashion. 
Methods such as cytogenetics, comparative genomic 
hybridization, and whole-genome allelotyping have addressed 
the issue at the genome level. Currently, the modifications
that take place in human tumors at the level of transcription
can also be studied on a large, unprecedented scale, using
new methods such as cDNA arrays that allow quantitative
measurement of the mRNA expression levels of many genes
simultaneously. Thus, it would be advantageous to provide a
means to assess the capacity of cDNA array testing in
clinical practice to better classify a heterogeneous cancer
into tumor subtypes with more homogeneous clinical outcomes,
and to identify new potential prognostic factors and
therapeutic targets.

The invention relates to a polynucleotide library
useful in the molecular characterization of a carcinoma, the
library including a pool of polynucleotide sequences or
subsequences thereof wherein the sequences or subsequences
are either underexpressed or overexpressed in tumor cells,
further wherein the sequences or subsequences correspond
substantially to any of the polynucleotide sequences set
forth in any of SEQ ID NOS: 1 - 468 or the complement
thereof.

Fig. 1 shows an example of differential gene
expression between normal breast tissue (NB) and breast tumor
samples.

Fig. 2 is a representation of expression levels 
of 176 genes in normal breast tissue (NB) and 34 samples of 
breast carcinoma. 

Fig. 3 is a prognostic classification of breast
cancer by gene expression profiling.

Fig. 4 shows the correlation of GATA3 expression 
with ER phenotype. 



In the context of this disclosure, a number of 
terms shall be utilized. 

The term "polynucleotide" refers to a polymer of 
RNA or DNA that is single-stranded, optionally containing 
synthetic, non-natural or altered nucleotide bases. A 
polynucleotide in the form of a polymer of DNA may be 
comprised of one or more segments of cDNA, genomic DNA or 
synthetic DNA. 

The term "subsequence" refers to a sequence of 
nucleic acids that comprises a part of a longer sequence of 
nucleic acids. 

The term "immobilized on a support" means bound 
directly or indirectly thereto including attachment by 
covalent binding, hydrogen bonding, ionic interaction,
hydrophobic interaction or otherwise. 

Breast cancer is characterized by an important 
histoclinical heterogeneity that currently hampers the 
selection of the most appropriate treatment for each case. 
This problem could be solved by the identification of new 
parameters that better predict the natural history of the 
disease and its sensitivity to treatment. An important
object of the present invention relates to a large-scale 
molecular characterization of breast cancer that could help 
in prediction, prognosis and cancer treatment. 

An important aspect of the invention relates to
the use of cDNA arrays, which allow the quantitative study of
mRNA expression levels of 188 candidate genes in 34
consecutive primary breast carcinomas along three directions:
comparison of tumor samples, correlations of molecular data
with conventional histoclinical prognostic features and gene
correlations. The experimentation evidenced extensive
heterogeneity of breast tumors at the transcriptional level.
A hierarchical clustering algorithm identified two molecularly
distinct subgroups of tumors characterized by a different
clinical outcome after chemotherapy. This outcome could not
have been predicted by the commonly used histoclinical
parameters. No correlation was found with the age of
patients, tumor size, histological type or grade. However,
gene expression differed according to lymph node
metastasis and according to the estrogen receptor
status; ERBB2 expression was strongly correlated with the
lymph node status (p < 0.0001) and that of GATA3 with the
presence of estrogen receptors (p < 0.001). Thus,
experimental results identified new ways to group tumors
according to outcome and new potential targets of
carcinogenesis. They show that the systematic use of cDNA
array testing holds great promise to improve the
classification of breast cancer in terms of prognosis and
chemosensitivity and to provide new potential therapeutic
targets.
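
By way of illustration only, the grouping of tumors by hierarchical clustering can be sketched as follows. The text does not state which linkage rule or distance measure was used, so average linkage and a Pearson correlation distance are assumptions here, and the sample names and expression values are invented toy data, not results of the study.

```python
from math import sqrt

def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / sqrt(var_x * var_y)

def cluster_tumors(profiles, n_groups=2):
    """Average-linkage agglomerative clustering (assumed method).

    profiles: dict mapping a sample name to a list of expression values.
    Returns a list of sets of sample names, one set per group.
    """
    clusters = [{name} for name in profiles]
    while len(clusters) > n_groups:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                dist = sum(
                    pearson_distance(profiles[a], profiles[b]) for a, b in pairs
                ) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # merge the two closest clusters
        del clusters[j]
    return clusters

# Hypothetical toy profiles: four tumors measured on four genes.
toy_profiles = {
    "T1": [2.0, 1.8, 0.2, 0.1],
    "T2": [1.9, 2.1, 0.3, 0.2],
    "T3": [0.1, 0.3, 2.2, 1.9],
    "T4": [0.2, 0.1, 1.8, 2.0],
}
groups = cluster_tumors(toy_profiles)
```

With these toy values, T1 and T2 fall into one group and T3 and T4 into the other, since their profiles are strongly correlated within each pair and anti-correlated across pairs.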

DNA arrays consist of large numbers of DNA
molecules spotted in a systematic order on a solid support or
substrate such as a nylon membrane, glass slide, glass beads
or a silicon chip. Depending on the size of each DNA spot on
the array, DNA arrays can be categorized as microarrays (each
DNA spot has a diameter less than 250 microns) and
macroarrays (spot diameter is greater than 300 microns). When
the solid substrate used is small in size, arrays are also
referred to as DNA chips. Depending on the spotting
technique used, the number of spots on a glass microarray can
range from hundreds to thousands.
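
The size thresholds given above can be expressed as a small helper, shown here as a sketch only. The function name is hypothetical, and because the text does not categorize spots between 250 and 300 microns, that range is returned as "unspecified" rather than guessed.

```python
def classify_array(spot_diameter_um):
    """Classify a DNA array by spot diameter in microns.

    Uses the thresholds stated in the text: below 250 microns is a
    microarray, above 300 microns is a macroarray.
    """
    if spot_diameter_um <= 0:
        raise ValueError("spot diameter must be positive")
    if spot_diameter_um < 250:
        return "microarray"
    if spot_diameter_um > 300:
        return "macroarray"
    # The 250-300 micron range is not categorized in the text.
    return "unspecified"
```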

DNA microarrays serve a variety of purposes,
including gene expression profiling, de novo gene
sequencing, gene mutation analysis, gene mapping and
genotyping. cDNA microarrays are printed with distinct cDNA
clones isolated from cDNA libraries. Therefore, each spot
represents an expressed gene, since it is derived from a
distinct mRNA.

Typically, a method of monitoring gene expression
involves (1) providing a pool of sample
polynucleotides comprising RNA transcript(s) of one or more
target gene(s) or nucleic acids derived from the RNA
transcript(s); (2) reacting, such as hybridizing, the sample
polynucleotide to an array of probes (for example,
polynucleotides obtained from a polynucleotide library)
(including control probes); and (3) detecting the
reacted/hybridized polynucleotides. Detection can also
involve calculating/quantifying a relative expression
(transcription) level.

The present invention concerns a polynucleotide
library useful in the molecular characterization of a
carcinoma, said library comprising a pool of polynucleotide
sequences or subsequences thereof wherein said sequences or
subsequences are either underexpressed or overexpressed in
tumor cells, further wherein said sequences or subsequences
correspond substantially to any of the polynucleotide
sequences set forth in any of SEQ ID NOS: 1 - 468 in annex or
the complement thereof.

Obviously, sequences having a great degree of
homology with the above sequences could also be used to
carry out the molecular characterization of the invention,
in particular when those sequences present one or a few point
mutations when compared with any one of SEQ ID NOS:
1 - 468.

The invention concerns a polynucleotide library useful
in the molecular characterization of a carcinoma, said
library comprising a pool of polynucleotide sequences or
subsequences thereof wherein said sequences or subsequences
are overexpressed in tumor cells, further wherein said
sequences or subsequences correspond substantially to any of
the polynucleotide sequences set forth in any of SEQ ID NOS:
1 - 249 (Here, these SEQ ID N° refer to old SEQ ID N° 1-249
in the priority document; the correlation table 10 allows
these sequences to be identified in the sequence listing of the
present application in annex) or the complement thereof.

Preferably the pool of polynucleotide sequences or
subsequences corresponds substantially to the polynucleotide
sequences set forth in any of SEQ ID NOS: 1 - 247 (Here,
these SEQ ID N° refer to old SEQ ID N° 1-247 in the priority
document; the correlation table 10 allows these
sequences to be identified in the sequence listing of the
present application in annex); further wherein said sequences
are useful in differentiating a normal cell from a cancer cell.

The invention relates also to a polynucleotide library
wherein the pool of polynucleotide sequences or subsequences
corresponds substantially to the polynucleotide sequences set
forth in any of SEQ ID NOS: 1 - 242 (Here, these SEQ ID N°
refer to old SEQ ID N° 1-242 in the priority document; the
correlation table 10 allows these sequences to be identified in
the sequence listing of the present application in annex);
wherein said sequences are useful in detecting a hormone
sensitive tumor cell, or wherein said sequences are useful in
differentiating a tumor with lymph nodes from a tumor without
lymph nodes.



The invention relates also to a polynucleotide library
wherein the pool of polynucleotide sequences or subsequences
corresponds substantially to the polynucleotide sequences set
forth in any of SEQ ID NOS: 1 - 224 (Here, these SEQ ID N°
refer to old SEQ ID N° 1-224 in the priority document; the
correlation table 10 allows these sequences to be identified in
the sequence listing of the present application in annex);
wherein said sequences are useful in differentiating
anthracycline-sensitive tumors from anthracycline-insensitive
tumors.

The invention relates also to any polynucleotide library 
as previously described wherein said polynucleotides are 
immobilized on a solid support in order to form a 
polynucleotide array. 

Preferably the support is selected from the group
consisting of a nylon membrane, glass slide, glass beads, or
a silicon chip.

The invention concerns also a method for detecting
differentially expressed polynucleotide sequences which are
correlated with a cancer, said method comprising:

a) obtaining a polynucleotide sample from a patient;

b) reacting the sample polynucleotide obtained in step
(a) with a probe immobilized on a solid support wherein said
probe comprises any of the polynucleotide sequences of the
libraries previously described or an expression product
encoded by any of the polynucleotide sequences of said
libraries; and

c) detecting the reaction product of step (b).

The invention relates also to such a method for
detecting differentially expressed polynucleotide sequences
of the invention wherein the amount of reaction product of
step (c) is compared to a control sample.

Preferably the polynucleotide sample isolated from the
patient is RNA or mRNA.

Preferably the polynucleotide sample is cDNA obtained by
reverse transcription of the mRNA.

In a preferred embodiment of the method for detecting
differentially expressed polynucleotide sequences, step
(b) comprises a hybridization of the sample RNA with the
labeled probe.

The method for detecting differentially expressed
polynucleotide sequences is used for detecting, diagnosing,
staging, monitoring, prognosticating, preventing or treating
conditions associated with cancer, in particular breast cancer.

The method for detecting differentially expressed
polynucleotide sequences is particularly useful wherein the
product encoded by any of the polynucleotide sequences or
subsequences is involved in a receptor-ligand reaction on
which detection is based.

WO 02/46467 PCT/IB01/02811

The invention relates also to a method for screening an
anti-tumor agent comprising the method for detecting
differentially expressed polynucleotide sequences previously
described wherein the sample has been treated with the
anti-tumor agent to be screened.



The label used to label polynucleotide samples is
selected from the group consisting of radioactive,
colorimetric, enzymatic, molecular amplification,
bioluminescent or fluorescent labels.

The invention also relates to a library of
polynucleotides comprising a population of polynucleotide
sequences overexpressed or underexpressed in cells derived
from a tumor selected from SEQ ID NO: 1 to SEQ ID NO: 249 and
their respective complements (Here, these SEQ ID N° refer
to old SEQ ID N° 1-249 in the priority document; the correlation
table 10 allows these sequences to be identified in the sequence
listing of the present application in annex).

In a particular embodiment the invention relates
to polynucleotide sequences: SEQ ID No: 1; SEQ ID No: 5;
SEQ ID No: 8; SEQ ID No: 9; SEQ ID No: 28; SEQ ID No: 29;
SEQ ID No: 30; SEQ ID No: 31; SEQ ID No: 32; SEQ ID No: 45;
SEQ ID No: 46; SEQ ID No: 52; SEQ ID No: 54; SEQ ID No: 63;
SEQ ID No: 64; SEQ ID No: 81; SEQ ID No: 82; SEQ ID No: 87;
SEQ ID No: 88; SEQ ID No: 101; SEQ ID No: 102; SEQ ID No: 103;
SEQ ID No: 104; SEQ ID No: 105; SEQ ID No: 107; SEQ ID No: 113;
SEQ ID No: 114; SEQ ID No: 115; SEQ ID No: 116; SEQ ID No: 127;
SEQ ID No: 128; SEQ ID No: 131; SEQ ID No: 139; SEQ ID No: 140;
SEQ ID No: 142; SEQ ID No: 150; SEQ ID No: 151; SEQ ID No: 154;
SEQ ID No: 156; SEQ ID No: 160; SEQ ID No: 161; SEQ ID No: 162;
SEQ ID No: 177; SEQ ID No: 178; SEQ ID No: 194; SEQ ID No: 195;
SEQ ID No: 227; SEQ ID No: 228; SEQ ID No: 229; SEQ ID No: 231;
SEQ ID No: 233; SEQ ID No: 243; SEQ ID No: 244; SEQ ID No: 245;
SEQ ID No: 246; SEQ ID No: 247 (Here, these SEQ ID N° refer
to old SEQ ID N° presented on table 5 in the priority document;
the correlation table 10 allows these sequences to be identified
in the sequence listing of the present application in annex),
which distinguish a healthy person from a person with cancer.

Preferably the invention relates to
polynucleotide sequences: SEQ ID No: 1; SEQ ID No: 5;
SEQ ID No: 102; SEQ ID No: 103; SEQ ID No: 107; SEQ ID No: 229;
SEQ ID No: 45; SEQ ID No: 46; SEQ ID No: 243; SEQ ID No: 244;
SEQ ID No: 245; SEQ ID No: 246; SEQ ID No: 247 (Here,
these SEQ ID N° refer to old SEQ ID N° presented on table 6
in the priority document; the correlation table 10 allows
these sequences to be identified in the sequence listing of the
present application in annex), which distinguish a healthy
person from a person with cancer.

In another particular embodiment the invention relates
to polynucleotide sequences: SEQ ID No: 2; SEQ ID No: 3;
SEQ ID No: 4; SEQ ID No: 5; SEQ ID No: 6; SEQ ID No: 7;
SEQ ID No: 8; SEQ ID No: 9; SEQ ID No: 10; SEQ ID No: 11;
SEQ ID No: 12; SEQ ID No: 13; SEQ ID No: 14; SEQ ID No: 15;
SEQ ID No: 16; SEQ ID No: 17; SEQ ID No: 18; SEQ ID No: 19;
SEQ ID No: 20; SEQ ID No: 21; SEQ ID No: 22; SEQ ID No: 23;
SEQ ID No: 24; SEQ ID No: 25; SEQ ID No: 26; SEQ ID No: 27;
SEQ ID No: 221; SEQ ID No: 222; SEQ ID No: 223;
SEQ ID No: 241; SEQ ID No: 242 (Here, these SEQ ID N° refer
to old SEQ ID N° presented on table 7 in the priority document;
the correlation table 10 allows these sequences to be identified
in the sequence listing of the present application in annex),
which detect hormone sensitive tumors.

Preferably the invention relates to polynucleotide
sequences SEQ ID No: 1; SEQ ID No: 2; SEQ ID No: 3;
SEQ ID No: 4; SEQ ID No: 5; SEQ ID No: 221; SEQ ID No: 222;
SEQ ID No: 15; SEQ ID No: 16; SEQ ID No: 17; SEQ ID No: 18;
SEQ ID No: 19; SEQ ID No: 20; SEQ ID No: 21; SEQ ID No: 22;
SEQ ID No: 241; SEQ ID No: 242 (Here, these SEQ ID N° refer to old
SEQ ID N° presented on table 8 in the priority document; the
correlation table 10 allows these sequences to be identified in
the sequence listing of the present application in annex),
which detect hormone sensitive tumors.

In another particular embodiment the invention
relates to polynucleotide sequences: SEQ ID No: 1; SEQ ID No: 3;
SEQ ID No: 4; SEQ ID No: 19; SEQ ID No: 20; SEQ ID No: 21;
SEQ ID No: 22; SEQ ID No: 23; SEQ ID No: 26; SEQ ID No: 27;
SEQ ID No: 28; SEQ ID No: 29; SEQ ID No: 30; SEQ ID No: 31;
SEQ ID No: 32; SEQ ID No: 33; SEQ ID No: 34; SEQ ID No: 35;
SEQ ID No: 36; SEQ ID No: 37; SEQ ID No: 38; SEQ ID No: 39;
SEQ ID No: 40; SEQ ID No: 41; SEQ ID No: 42; SEQ ID No: 43;
SEQ ID No: 44; SEQ ID No: 221; SEQ ID No: 222; SEQ ID No: 233;
SEQ ID No: 241; SEQ ID No: 242
(Here, these SEQ ID N° refer to old SEQ ID N° presented on
table 8 in the priority document; the correlation table 10 allows
these sequences to be identified in the sequence listing of the
present application in annex), which distinguish tumors with
lymph node from tumors with no lymph node.

Preferably the invention relates to
polynucleotide sequences: SEQ ID No: 1; SEQ ID No: 21;
SEQ ID No: 22; SEQ ID No: 28; SEQ ID No: 29; SEQ ID No: 29;
SEQ ID No: 31; SEQ ID No: 32; SEQ ID No: 19; SEQ ID No: 20;
SEQ ID No: 26; SEQ ID No: 27; SEQ ID No: 37; SEQ ID No: 38;
SEQ ID No: 39; SEQ ID No: 241; SEQ ID No: 241 (Here,
these SEQ ID N° refer to old SEQ ID N° presented on table 8
in the priority document; the correlation table 10 allows
these sequences to be identified in the sequence listing of the
present application in annex), which distinguish tumors with
lymph node from tumors with no lymph node.

In another particular embodiment the invention relates
to polynucleotide sequences: SEQ ID No: 1; SEQ ID No: 2;
SEQ ID No: 6; SEQ ID No: 7; SEQ ID No: 8; SEQ ID No: 9;
SEQ ID No: 10; SEQ ID No: 11; SEQ ID No: 13; SEQ ID No: 14;
SEQ ID No: 19; SEQ ID No: 20; SEQ ID No: 21; SEQ ID No: 22;
SEQ ID No: 23; SEQ ID No: 35; SEQ ID No: 35; SEQ ID No: 37;
SEQ ID No: 56; SEQ ID No: 57; SEQ ID No: 74; SEQ ID No: 75;
SEQ ID No: 102; SEQ ID No: 104; SEQ ID No: 107; SEQ ID No: 108;
SEQ ID No: 109; SEQ ID No: 118; SEQ ID No: 119; SEQ ID No: 136;
SEQ ID No: 213; SEQ ID No: 214; SEQ ID No: 215; SEQ ID No: 223;
SEQ ID No: 224 (Here, these
SEQ ID N° refer to old SEQ ID N° presented on table 11 in
the priority document; the correlation table 10 allows
these sequences to be identified in the sequence listing of the
present application in annex), which distinguish tumors
sensitive to anthracycline from tumors insensitive to
anthracycline.

The invention relates also to a method of detecting 
differentially expressed genes correlated with a cancer 
comprising detecting at least one library of polynucleotide 
sequences as above defined or of products encoded by said 
library in a sample obtained from a patient. 

A particular embodiment of the invention relates
to a polynucleotide library corresponding substantially to
any combination of at least one polynucleotide sequence
selected among those included in each one of predefined
polynucleotide sequences sets 1 to 212 as defined in
table 4.

The invention relates obviously to polynucleotide
libraries comprising at least one polynucleotide selected
among those included in at least 50%, preferably 75% and more
preferably 100% of said predefined sets, making it possible to
obtain a discriminating gene pattern, namely to distinguish between
normal patients and patients suffering from tumor pathology,
between patients having a hormone sensitive tumor and
patients having a hormone resistant tumor, between patients
having a tumor with lymph nodes and patients having a tumor
without lymph nodes, between patients having an
anthracycline-sensitive tumor and patients having an
anthracycline-insensitive tumor and between patients having good
prognosis primary breast tumors and patients having poor prognosis
primary breast tumors.
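
The 50%, 75% and 100% thresholds above amount to a coverage check over the predefined sets, which can be sketched as follows. The function name, the set names and their members are hypothetical placeholders; the actual sets are those of table 4, which is not reproduced here.

```python
def set_coverage(library, predefined_sets):
    """Return the fraction of predefined sets represented in the library.

    library: set of sequence identifiers present in the library.
    predefined_sets: dict mapping a set name to a set of identifiers.
    A set counts as represented if at least one member is in the library.
    """
    if not predefined_sets:
        raise ValueError("at least one predefined set is required")
    covered = sum(
        1 for members in predefined_sets.values() if library & members
    )
    return covered / len(predefined_sets)

# Hypothetical sets and library, for illustration only.
example_sets = {"set 1": {1, 2}, "set 2": {3}, "set 3": {4, 5}, "set 4": {6}}
example_library = {1, 3, 5}
coverage = set_coverage(example_library, example_sets)  # 3 of 4 sets covered
```

In this toy case the library reaches the preferred 75% level but not the more preferred 100% level, because nothing from "set 4" is present.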

A polynucleotide sequence library useful for
carrying out the invention can also comprise any sequence
comprised between the 3' end and the 5' end of each polynucleotide
sequence set as defined in table 4, allowing the complete
detection of the implicated genes.

The invention relates also to a polynucleotide 
library useful to differentiate a normal cell from a cancer 
cell wherein the pool of polynucleotide sequences or 
subsequences correspond substantially to any combination of 
at least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets indicated on table 5, useful in differentiating a normal 
cell from a cancer cell. 

Preferably the polynucleotide library useful to 
differentiate a normal cell from a cancer cell correspond 
substantially to any combination of at least one
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets indicated on 
table 5A, and of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets indicated in table 5B. 

The detection of an overexpression of genes
identified with sets of polynucleotide sequences defined in
table 5A, together with detection of an underexpression of
genes identified with sets of polynucleotide sequences
defined in table 5B makes it possible to distinguish between
normal patients and patients suffering from tumor pathology.
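
The combined over- and underexpression criterion can be sketched as a simple rule on log2 expression ratios. This is an illustrative reading only: the twofold threshold, the requirement that every listed gene meet it, and the gene names are all assumptions, since the text does not specify a cutoff or a decision rule.

```python
def matches_signature(log2_ratios, up_genes, down_genes, threshold=1.0):
    """Check a sample against an over/underexpression signature.

    log2_ratios: dict mapping gene name to log2(sample / control).
    up_genes: genes expected to be overexpressed (e.g. a table "A" list).
    down_genes: genes expected to be underexpressed (e.g. a table "B" list).
    threshold: assumed cutoff; 1.0 corresponds to a twofold change.
    """
    over = all(log2_ratios[g] >= threshold for g in up_genes)
    under = all(log2_ratios[g] <= -threshold for g in down_genes)
    return over and under

# Hypothetical gene names and values, for illustration only.
sample = {"gene_A": 1.5, "gene_B": 2.0, "gene_C": -1.2}
is_match = matches_signature(sample, ["gene_A", "gene_B"], ["gene_C"])
```

A stricter or looser rule (for example, a majority of genes rather than all of them) would fit the same description; the choice here is only the simplest one to state.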

The invention relates also to a polynucleotide 
library useful to detect a hormone sensitive tumor cell 
wherein the pool of polynucleotide sequences or subsequences 
correspond substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 6.

Preferably the polynucleotide library useful to 
detect a hormone sensitive tumor cell correspond 
substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 6A together with at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets defined in table 6B. 

The detection of an overexpression of genes
identified with sets of polynucleotide sequences defined in
table 6A, together with detection of an underexpression of
genes identified with sets of polynucleotide sequences
defined in table 6B makes it possible to distinguish between patients
having a hormone sensitive tumor and patients having a
hormone resistant tumor.



The invention concerns also a polynucleotide 
library useful to differentiate a tumor with lymph nodes from 
a tumor without lymph nodes wherein the pool of 
polynucleotide sequences or subsequences correspond 
substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 7. 

Preferably, the polynucleotide library useful to 
differentiate a tumor with lymph nodes from a tumor without 
lymph nodes correspond substantially to any combination of at 
least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences
sets defined in table 7A together with at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 7B. 

The detection of an overexpression of genes
identified with sets of polynucleotide sequences defined in
table 7A, together with detection of an underexpression of
genes identified with sets of polynucleotide sequences
defined in table 7B makes it possible to distinguish between patients
having a tumor with lymph nodes and patients having a tumor
without lymph nodes.

The invention concerns also a polynucleotide 
library useful to differentiate anthracycline-sensitive tumors
from anthracycline-insensitive tumors wherein the pool of
polynucleotide sequences or subsequences correspond 
substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 8. 

Preferably, the polynucleotide library useful to 
differentiate anthracycline-sensitive tumors from
anthracycline-insensitive tumors correspond substantially to
any combination of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets defined in table 8A together 
with at least one polynucleotide sequence selected among 
those included in each one of predefined polynucleotide 
sequences sets defined in table 8B. 

The detection of an overexpression of genes
identified with sets of polynucleotide sequences defined in
table 8A, together with detection of an underexpression of
genes identified with sets of polynucleotide sequences
defined in table 8B makes it possible to distinguish between patients
having an anthracycline-sensitive tumor and patients having
an anthracycline-insensitive tumor.

The invention concerns also a polynucleotide 
library useful to classify good and poor prognosis primary 
breast tumors wherein the pool of polynucleotide sequences or 
subsequences correspond substantially to any combination of 
at least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets defined in table 9.

Preferably, the polynucleotide library useful to
classify good and poor prognosis primary breast tumors
correspond substantially to any combination of at least one
polynucleotide sequence selected among those included in each
one of predefined polynucleotide sequences sets defined in
table 9A together with at least one polynucleotide sequence
selected among those included in each one of predefined
polynucleotide sequences sets defined in table 9B.

The detection of an overexpression of genes
identified with sets of polynucleotide sequences defined in
table 9A, together with detection of an underexpression of
genes identified with sets of polynucleotide sequences
defined in table 9B makes it possible to classify patients having good
and poor prognosis primary breast tumors.

In a preferred embodiment, the tumor cells
presenting underexpressed or overexpressed sequences from the
polynucleotide library of the invention are breast tumor
cells.

In a particular embodiment the polynucleotides of 
the polynucleotide library of the present invention are 
immobilized on a solid support in order to form a 
polynucleotide array, and said solid support is selected from 
the group consisting of a nylon membrane, nitrocellulose 
membrane, glass slide, glass beads, membranes on glass 
support or a silicon chip. 

Another object of the present invention concerns
a polynucleotide array useful for the prognosis or diagnosis of
a tumor comprising at least one immobilized polynucleotide
library set as previously defined.

The invention thus concerns a polynucleotide
array useful to differentiate a normal cell from a cancer
cell comprising any combination of at least one
polynucleotide sequence selected among those included in each
one of predefined polynucleotide sequences sets indicated on
table 5, useful in differentiating a normal cell from a
cancer cell.

Preferably the polynucleotide array useful to 
differentiate a normal cell from a cancer cell bears any 
combination of at least one polynucleotide sequence selected 
among those included in each one of predefined polynucleotide 
sequences sets indicated on table 5A, and of at least one
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets indicated in 
table 5B. 

The invention relates also to a polynucleotide 
array useful to detect a hormone sensitive tumor cell 
comprising any combination of at least one polynucleotide 
sequence selected among those included in each one of 
predefined polynucleotide sequences sets defined in table 6.

Preferably the polynucleotide array useful to 
detect a hormone sensitive tumor cell bears any combination 
of at least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets defined in table 6A together with at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 6B. 

The invention concerns also a polynucleotide 
array useful to differentiate a tumor with lymph nodes from a 
tumor without lymph nodes comprising any combination of at 
least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets defined in table 7.

Preferably, the polynucleotide array useful to
differentiate a tumor with lymph nodes from a tumor without 
lymph nodes bears any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 7A together with at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets defined in table 7B. 

The invention concerns also a polynucleotide 
array useful to differentiate anthracycline-sensitive tumors
from anthracycline-insensitive tumors comprising any
combination of at least one polynucleotide sequence selected 
among those included in each one of predefined polynucleotide 
sequences sets defined in table 8. 

Preferably, the polynucleotide array useful to 
differentiate anthracycline-sensitive tumors from
anthracycline-insensitive tumors bears any combination of at
least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets defined in table 8A together with at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 8B. 

The invention concerns also a polynucleotide 
array useful to classify good and poor prognosis primary 
breast tumors comprising any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets defined in 
table 9.

Preferably, the polynucleotide array useful to
classify good and poor prognosis primary breast tumors bears
any combination of at least one polynucleotide sequence
selected among those included in each one of predefined
polynucleotide sequences sets defined in table 9A together
with at least one polynucleotide sequence selected among
those included in each one of predefined polynucleotide
sequences sets defined in table 9B.

The present invention also concerns a method for 
detecting differentially expressed polynucleotide sequences 
that are correlated with a cancer, said method comprising: 

a) obtaining a polynucleotide sample from a 
patient; 

b) reacting the sample polynucleotide obtained in 
step (a) with a probe immobilized on a solid support wherein 
said probe comprises any of the polynucleotide sequences of 
the libraries previously defined or an expression product 
encoded by any of the polynucleotide sequences of the 
libraries previously defined; and 

c) detecting the reaction product of step (b). 

Preferably, the polynucleotide sample obtained at 
step (a) is labeled before its reaction at step (b) with the 
probe immobilized on a solid support. 

The label of the polynucleotide sample is 
selected from the group consisting of radioactive, 
colorimetric, enzymatic, molecular amplification, 
bioluminescent and fluorescent labels. 

In a particular embodiment, the reaction product 
of step (c) is quantified by further comparison of said 
reaction product to a control sample. 

In a first embodiment, the polynucleotide sample 
isolated from the patient and obtained at step (a) is either 
RNA or mRNA. 

In another embodiment, the polynucleotide sample 
isolated from the patient is cDNA obtained by reverse 
transcription of the mRNA. 

Preferably, the reaction step (b) of the method 
for detecting differentially expressed polynucleotide 
sequences comprises a hybridization of the sample RNA 
obtained from the patient with the probe. 

Preferably the sample RNA is labeled before 
hybridization with the probe and the label is selected from 
the group consisting of radioactive, colorimetric, enzymatic, 
molecular amplification, bioluminescent or fluorescent. 

This method for detecting differentially 
expressed polynucleotide sequences is particularly useful for 
detecting, diagnosing, staging, monitoring, prognosticating, 
preventing or treating conditions associated with cancer, and 
particularly breast cancer. 

The method for detecting differentially expressed 
polynucleotide sequences is also particularly useful when the 
product encoded by any of the polynucleotide sequences or 
subsequences set is involved in a receptor-ligand reaction on 
which detection is based. 

The present invention also relates to a 
method for screening an anti-tumor agent comprising the 
above-described method for detecting differentially 
expressed polynucleotide sequences, wherein the sample has 
been treated with the anti-tumor agent to be screened. 

In a particular embodiment, the method for 
screening an anti-tumor agent comprises detecting 
polynucleotide sequences reacting with at least one library 
of polynucleotides or polynucleotide sequence set as 
previously defined, or with products encoded by said library, 
in a sample obtained from a patient. 

The invention is illustrated by examples detailed 
below related to particular experimental results obtained 
with selected libraries of polypeptides useful to identify 
and distinguish tumor samples from normal ones. 

Tumor samples and RNA extraction 

To avoid any selection bias as to the type and 
size of the tumors, the RNAs to be tested were prepared from 
unselected samples. Samples of primary invasive breast 
carcinomas were collected from 34 patients undergoing surgery 
at the Institut Paoli-Calmettes. After surgical resection, 
the tumors were macrodissected: a section was taken for the 
pathologist's diagnosis and an adjacent piece was quickly 
frozen in liquid nitrogen for molecular analyses. The median 
age of patients at the time of diagnosis was 55 years (range 
39 to 83) and most of them were post-menopausal. Tumors were 
classified according to the WHO histological typing of breast 
tumors as: 29 ductal carcinomas, 2 lobular carcinomas, 1 
mixed ductal and lobular carcinoma, and 2 medullary 
carcinomas. They varied in size (20 mm or less, n = 13; 
between 20 and 50 mm, n = 18; greater than 50 mm, n = 3), 
axillary lymph node status (negative: 19 tumors, positive: 
15 tumors), SBR grading (I: 3 tumors, II: 20 tumors, III: 10 
tumors, not evaluable: 1 tumor), and estrogen receptor (ER) 
status evaluated by immunohistochemical assay (23 
ER-positive, 11 ER-negative). The ER positivity cutoff value 
was 10%. Adjuvant treatment with radiotherapy and, when 
necessary, multi-agent anthracycline-based chemotherapy 
(n = 16) was given to patients according to local practice. 

Total RNA was extracted from tumor samples by 
standard methods (43). Total RNA from normal breast tissue 
was obtained from Clontech (Palo Alto, CA): RNA was isolated 
from 8 tissue specimens from Caucasian females, age range 23 
to 47. RNA integrity was checked by denaturing formaldehyde 
agarose gel electrophoresis and Northern blots using a 
28S-specific oligonucleotide. 

cDNA arrays preparation 

Gene expression was analyzed by hybridization of 
arrays with radioactive probes. The arrays contained PCR 
products of 5 control clones and 180 IMAGE human cDNA clones 
selected on practical criteria (3' sequence of mRNA, same 
cloning vector, host bacteria and insert size). These 
represented 176 genes (4 genes were represented by 2 
different clones): 121 with proven or putative implication 
in cancer and 55 implicated in immune reactions (the list is 
available on the web site: 
http://tagc.univ-mrs.fr/pub/Cancer/). Their identity was 
verified by 5' tag-sequencing of plasmid DNA and comparison 
with sequences in the EST (dbEST) and nucleotide (GenBank) 
databases at the NCBI. Identity was confirmed for all but 14 
clones without significant gene similarity, which were 
referenced by their GenBank accession number. The control 
clones were: Arabidopsis thaliana cytochrome c554 gene (used 
for hybridization signal normalization), 3 poly(A) sequences 
of different sizes and the vector pT7T3D (negative controls). 

PCR amplification, purification and robotic 
spotting of PCR products onto Hybond-N+ membranes (Amersham) 
were done according to described protocols (4). All PCR 
products were spotted in duplicate. For normalization 
purposes, the c554 gene was spotted 96 times, scattered over 
the whole membrane. 

cDNA array hybridizations 

Hybridizations were done successively with a 
vector oligonucleotide (to precisely determine the amount of 
target DNA accessible to hybridization in each spot), then, 
after stripping of the vector probe, with complex probes made 
from the RNAs (4). Each complex probe was hybridized to a 
distinct filter. Probes were prepared from total RNA with an 
excess of oligo(dT25) to saturate the poly(A) tails of the 
messengers, and to ensure that the reverse-transcribed 
product did not contain long poly(T) sequences. A precise 
amount of c554 mRNA was added to the total RNA before 
labeling to allow normalization of the data. 

Five µg of total RNA (~100 ng of mRNA) from tissue 
samples were used for each labeling. Probe preparation and 
hybridization of the membranes were done according to known 
procedures (http://tagc.univ-mrs.fr/pub/Cancer/). 

Hybridization was done in excess of target (~15 
ng of DNA in each spot) and binding of cDNAs to the targets 
was linear and proportional to the quantity of cDNA in the 
probe. 

Detection and quantification of cDNA array hybridization 
signals 

Quantitative data were obtained using an imaging 
plate device. Hybridization signal detection with a FUJI BAS 
1500 machine and quantification with the HDG Analyzer 
software (Genomic Solutions, Ann Arbor, MI) were done as 
previously described (http://tagc.univ-mrs.fr/pub/Cancer/). 
Quantification was done by integrating all spot pixel 
intensities and subtracting a spot background value 
determined in the neighboring area. Spots were located with a 
Laplacian transformation. The spot background level was the 
median intensity of all the pixels present in a small window 
centered on the spot and which were not part of any spot 
(44). Quantified data were normalized in three steps and 
expressed as absolute gene expression levels (i.e. in 
percentage of abundance of individual mRNA with respect to 
mRNA within the sample), as described (4). 
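
The quantification step described above can be sketched as follows. This is a minimal illustration and not the HDG Analyzer implementation: the function name `quantify_spot`, the image represented as a list of rows, the pixel coordinate sets and the `half_window` parameter are all assumptions introduced here. Scaling the median background by the number of spot pixels before subtraction is also an assumption, since the text does not state how the background value is applied.

```python
from statistics import median


def quantify_spot(image, spot_pixels, all_spot_pixels, center, half_window):
    """Return the background-corrected integrated intensity of one spot.

    image            2D list of pixel intensities (rows of numbers)
    spot_pixels      set of (row, col) pixels belonging to this spot
    all_spot_pixels  set of (row, col) pixels belonging to any spot
    center           (row, col) of the spot centre
    half_window      half-width of the square background window
    """
    n_rows, n_cols = len(image), len(image[0])
    r0, c0 = center
    # Background: pixels of the window centred on the spot that are not
    # part of any spot.
    background_pixels = [
        image[r][c]
        for r in range(max(0, r0 - half_window), min(n_rows, r0 + half_window + 1))
        for c in range(max(0, c0 - half_window), min(n_cols, c0 + half_window + 1))
        if (r, c) not in all_spot_pixels
    ]
    background = median(background_pixels)
    integrated = sum(image[r][c] for r, c in spot_pixels)
    # Assumption: the per-pixel background is removed from every spot pixel.
    return integrated - background * len(spot_pixels)


# Hypothetical 5 x 5 image: uniform background of 2 with one bright pixel.
image = [[2] * 5 for _ in range(5)]
image[2][2] = 12
spot = {(2, 2)}
signal = quantify_spot(image, spot, spot, center=(2, 2), half_window=2)
```

Using the median rather than the mean for the background keeps the estimate robust to a few bright pixels leaking from neighboring spots.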

Array data analysis 

Before analysis of the results, the 
reproducibility of the experiments was verified by comparing 
duplicate spots, or one hybridization with the same probe on 
two independent arrays, or two independent hybridizations 
with probes prepared from the same RNA. In every case, the 
results showed good reproducibility, with respective 
correlation coefficients of 0.95, 0.98 and 0.98 (data not 
shown). Moreover, genes represented by two different clones 
on the array, such as CDK4 or ETV5, displayed similar 
expression profiles for the two clones in all samples. This 
reproducibility was sufficient to consider a 2-fold 
expression difference as significant. 
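
As an illustration of the reproducibility check, the correlation coefficient between two sets of duplicate measurements could be computed as below. This is a sketch under stated assumptions: the helper name `pearson_r` and the duplicate values are hypothetical, and the text does not say whether the coefficients were computed on raw or log-transformed intensities (raw values are used here).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical expression levels (% of mRNA abundance) for duplicate spots.
duplicate_a = [0.010, 0.200, 0.050, 1.500, 0.004]
duplicate_b = [0.012, 0.180, 0.055, 1.400, 0.005]
r = pearson_r(duplicate_a, duplicate_b)
```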

For graphical representation, data were displayed 
as absolute expression levels (Fig. 2a). For better 
visualization of clustering, results were log-transformed and 
displayed as relative values median-centered in each row and 
in each column (Fig. 2b). Hierarchical clustering was 
applied to the tissue samples and the genes using the Cluster 
program developed by Eisen (45) (average linkage clustering 
using Pearson correlation as similarity metric). Results in 
Figs. 2 and 3 were displayed with the TreeView program (45). 
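
The clustering principle can be sketched with a small agglomerative procedure. This is not the Cluster program itself: it is a simplified average-linkage scheme in which the similarity between two clusters is the mean of the pairwise Pearson correlations of their members. The function names and the four sample profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation (undefined for constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_linkage(profiles):
    """Cluster named profiles; return the merges as (a, b, similarity)."""
    clusters = [(name,) for name in profiles]

    def similarity(c1, c2):
        # Mean pairwise correlation between the members of two clusters.
        total = sum(pearson_r(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Join the two most similar clusters.
        a, b = max(combinations(clusters, 2), key=lambda pair: similarity(*pair))
        merges.append((a, b, similarity(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical log-expression profiles of four samples over four genes.
profiles = {
    "T1": [1, 2, 3, 4],
    "T2": [2, 4, 6, 8],
    "T3": [4, 3, 2, 1],
    "T4": [9, 6, 4, 2],
}
merges = average_linkage(profiles)
```

The order of the merges gives the branching of the dendrogram: here T1 joins T2 and T3 joins T4 before the two pairs are joined at a negative similarity.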

Subsequent analysis was done using Excel software 
(Microsoft) and statistical analyses with the SPSS software. 
Metastasis-free survival and overall survival were measured 
from diagnosis until the first metastatic relapse or death, 
respectively. They were estimated with the Kaplan-Meier 
method and compared between groups with the log-rank test. 
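
For reference, the Kaplan-Meier product-limit estimate can be sketched as follows. The analyses reported here were run in SPSS; the function name `kaplan_meier` and the follow-up data below are hypothetical and serve only to show the calculation.

```python
def kaplan_meier(times, events):
    """Return the survival curve as a list of (time, survival probability).

    times   follow-up durations (e.g. months since diagnosis)
    events  True if the event (relapse or death) was observed at that
            time, False if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        n_events = sum(1 for tt, e in data if tt == t and e)
        if n_events:
            # Product-limit step: fraction of at-risk patients surviving t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t
        i += n_at_t
    return curve


# Hypothetical follow-up of five patients (months, event observed?).
curve = kaplan_meier([10, 20, 20, 30, 40], [True, False, True, True, False])
```

Censored patients contribute to the number at risk until their last follow-up, which is why the length of follow-up matters when two groups are compared.
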
Correlations of gene pairs based on expression profiles were 
measured with the correlation coefficient r. The search for 
genes with expression levels correlated with tumor parameters 
was done in several successive steps. 

First, genes were detected by comparing their 
median expression level in the two subgroups of tumors 
discordant according to the parameter of interest. The median 
values rather than the mean values were used because of the 
high variability of the expression levels for many genes, 
which resulted in a standard deviation of expression level 
similar or superior to the mean value and made comparisons 
of means impractical. Second, these detected genes were 
inspected visually on graphics, and finally, an appropriate 
statistical analysis was applied to those that were 
convincing to validate the correlation. Comparison of GATA3 
expression between ER-positive tumors and ER-negative tumors 
was validated using a Mann-Whitney test. Correlation 
coefficients were used to compare the gene expression levels 
to the number of axillary nodes involved. 
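
The first and last steps of this search can be sketched as below. The helper names and the expression values are hypothetical. The p-value uses the normal approximation to the Mann-Whitney U statistic without tie or continuity correction, which is an assumption: the text does not state which variant of the test was run in SPSS.

```python
from math import erfc, sqrt
from statistics import median


def median_ratio(group_a, group_b):
    """Ratio of median expression levels (median of group_b must be non-zero)."""
    return median(group_a) / median(group_b)


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) for group_a versus group_b."""
    # U counts the pairs in which the group_a value exceeds the group_b
    # value; ties count for one half.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    n1, n2 = len(group_a), len(group_b)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # two-sided normal approximation
    return u, p


# Hypothetical expression levels of one gene in two tumor subgroups.
subgroup_1 = [5, 6, 7, 8]
subgroup_2 = [1, 2, 3, 4]
ratio = median_ratio(subgroup_1, subgroup_2)
u, p = mann_whitney_u(subgroup_1, subgroup_2)
```

Both quantities depend only on ranks or medians, so a few extreme expression values do not dominate the comparison.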

Northern blot analysis 

Seventy-nine breast tumors, including 22 of the 
34 tested on the arrays, were analyzed for GATA3 expression 
by Northern blot hybridization. RNA extraction from tumor 
samples and Northern blots were done as previously described 
(43). The GATA3 probe was prepared from the IMAGE cDNA clone 
129757, which corresponds to the 3' region (from +843 to 
+1689) of the GATA3 cDNA sequence (GenBank accession no. 
X55122). The insert (846 bp) was obtained by digestion of 
the clone with EcoRI and PacI enzymes. Northern blots were 
stripped and re-hybridized using a β-actin probe (46). 

Fig. 1 shows an example of differential gene 
expression between normal breast tissue (NB) and breast tumor 
samples. Each cDNA array on Nylon filter was hybridized with 
a complex probe made from 5 µg of total RNA. The top image 
corresponds to the whole membrane. For the two bottom 
images, only the right portion of the membranes is shown. 
Numbers below the spots indicate housekeeping genes (1, GAPDH 
and 2, actin), negative control clones (3, 4 and 5) and 
examples of genes differentially expressed between NB and 
breast tumor (6, stromelysin 3; 7, ERBB2; 8, MYBL2; 9, FOS; 
10, TGFβR3; 11, desmin), and between ER- breast tumor and ER+ 
breast tumor (12, GATA3). 

Fig. 2 is a representation of expression levels 
of 176 genes in normal breast tissue (NB) and 34 samples of 
breast carcinoma. Each column corresponds to a single 
tissue, and each row to a single gene. (a) The results are 
expressed as percentage abundance of individual mRNA within 
the sample, and are represented using a blue color scale. 
The color scale (log scale with a 3-fold interval) indicated 
at the bottom left ranges from light blue (expression level 
0.001%) to dark blue (expression level > 3%). White squares 
indicate clones with undetectable expression levels and gray 
squares indicate missing data. The tissue samples are 
arbitrarily ordered and the clones are ordered from top to 
bottom according to increasing median expression levels. 
Horizontal black arrows on the right of the figure mark three 
clones with highly variable expression levels between the 
tumors (stromelysin 3, IGF2, GATA3 from top to bottom). (b) 
The results are shown as relative expression levels (relative 
to the median value of each row and each column) and are 
represented with a color scale indicated at the bottom left 
ranging from 1/100 to 100 fold changes (gray squares: missing 
data). Eighteen clones with median expression level equal to 
zero in the 34 tumors are omitted. The clustering program 
arranges samples (n = 35) along the horizontal axis so that 
those with the most similar expression profiles are placed 
adjacent to each other. Similarly, clones (n = 162) are near 
each other along the vertical axis if they show a strong 
expression profile correlation across all tissues. The 
length of the branches of the dendrograms capturing 
respectively the samples (top) and the clones (left) reflects 
the similarity of the related elements. Two groups of tumors 
are separated and color coded: group A (blue) and group B 
(orange). Horizontal black and horizontal red arrows on the 
right of the figure respectively mark three genes with highly 
variable expression levels between the tumors (IGF2, GATA3, 
stromelysin 3 from top to bottom) and four pairs of different 
clones representing four genes. (c) Zoom representation of 
group A from Figure 2b, excluding the two outlier tumors at 
the right. The clustering separates two subgroups of tumors, 
A1 and A2. The dotted branches correspond to tumors 
associated with metastatic relapse and death. Follow-up was 
longer in A2 than in A1 (median 81 months vs 47 for A1). 

Fig. 3 shows the prognostic classification of 
breast cancer by gene expression profiling, indicating that 
gene expression-based tumor classification correlates with 
clinical outcome. The 12 samples of group A (see figures 2b 
and 2c) were reclustered using the top 32 differentially 
expressed genes between the A1 and A2 subgroups. Data were 
displayed as in Fig. 2b and shown with the same color key. 
The hierarchical clustering was applied to expression data 
from the 23 clones, out of 32, whose expression levels 
presented an at least two-fold change in at least two samples 
(out of 12). Two subgroups of tumors, A1 and A2, are shown as 
well as two groups of differentially expressed clones. The 
dotted branches of tumor cluster A1 correspond to samples 
associated with metastatic relapse and death. Figure 3a shows 
a two-dimensional representation of the hierarchical 
clustering results shown in figures 2a and 2b. The analysis 
delineates 4 groups of tumors A, B, C and D. Black squares 
indicate patients alive at last follow-up visit and red 
squares indicate patients who died. Three classes of patients 
with a statistically different clinical outcome were defined 
according to gene expression profiles: class A (n = 16), 
class B+C (n = 34), class D (n = 5). Figure 3b illustrates 
the Kaplan-Meier plot of overall survival of the 3 classes of 
patients (p<0.005, log-rank test). Figure 3c illustrates 
the Kaplan-Meier plot of metastasis-free survival of the 3 
classes of patients (p<0.05, log-rank test). 

Fig. 4 shows the correlation of GATA3 expression 
with ER phenotype. (a) The expression levels of GATA3 in 34 
breast cancer samples (y axis) monitored by cDNA array 
analysis are reported in percentage of abundance of 
individual mRNA with respect to mRNA within the sample (log 
scale). GATA3 is significantly overexpressed in the 
ER-positive tumors (n = 23) versus the ER-negative tumors 
(n = 11) using the Mann-Whitney test (p = 0.0004). The 
expression level of GATA3 in normal breast tissue is reported 
on the right (NB). (b) Northern blot analysis of GATA3 in 
normal breast sample (NB) and 9 breast cancer samples (AT: 
tumor analyzed with cDNA array and Northern blot; NT: tumor 
analyzed with Northern blot). Blots were probed successively 
with cDNA from GATA3 (top) and β-actin (bottom). ER status 
is indicated for each tumor sample. 

Data representation 

Fig. 1 shows examples of hybridizations of cDNA 
arrays with probes made from RNA extracted from normal breast 
tissue and breast tumors. 

The crude results of all hybridizations were 
processed to be presented either as absolute or relative 
values in schematic figures. The normalization procedure 
allowed display of absolute values expressed in percent of 
abundance of mRNA in the probe as shown in Fig. 2a. Each 
level of the blue color ladder represents a 3-fold interval 
of absolute abundance of mRNA. Each column corresponds to a 
tissue sample and each row to a gene. For graphic purposes, 
genes were ordered from top to bottom according to increasing 
median expression levels. Tumor samples were not ordered. 
The values in each sample displayed a wide range of 
intensities (3 decades in log scale) corresponding to 
expression levels ranging from approximately 0.002% to 5% of 
mRNA abundance. Many genes (see for example stromelysin 3, 
IGF2 and GATA3, arrows) displayed highly variable expression 
levels across all tumor samples, scattered over the whole 
dynamic range of values. A representation of relative values 
is shown in Fig. 2b. Absolute values were log-transformed, 
omitting 18 clones whose median intensity was equal to zero 
across all tissues. Data for each of the 162 remaining 
clones were then median-centered, as well as data for each 
sample, so that the relative variation was shown, rather than 
the absolute intensity. A color scale was used to display 
data: red for expression level higher than the median and 
green for expression level lower than the median. The 
magnitude of the deviation from the median was represented by 
the color intensity. A hierarchical clustering program was 
then applied to group the 35 samples according to their 
overall gene expression profiles, and to group the 162 clones 
on the basis of similarity of their expression levels in all 
tissues. This resulted in a picture highlighting groups of 
correlated tissues and groups of correlated genes as depicted 
by dendrograms. 
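
The transformation from absolute to relative values can be sketched as follows. Several choices here are assumptions rather than statements about the original processing: a base-2 logarithm, strictly positive input values (the handling of individual undetectable values is not described), and a single pass that centers rows first and columns second. The function name and the example matrix are hypothetical.

```python
from math import log2
from statistics import median


def log_median_center(matrix):
    """Log-transform a clones x samples matrix, then median-center it.

    matrix: list of rows (one per clone) of positive expression levels.
    Rows are centered first, then columns.
    """
    logged = [[log2(value) for value in row] for row in matrix]
    # Center each clone (row) on its own median.
    row_centered = []
    for row in logged:
        row_median = median(row)
        row_centered.append([value - row_median for value in row])
    # Center each sample (column) on its own median.
    column_medians = [median(column) for column in zip(*row_centered)]
    return [
        [value - m for value, m in zip(row, column_medians)]
        for row in row_centered
    ]


# Hypothetical abundances: 2 clones (rows) measured in 3 samples (columns).
relative = log_median_center([[1, 2, 4], [4, 4, 16]])
```

After centering, a positive value marks expression above the median (red in Fig. 2b) and a negative value marks expression below it (green).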

Breast tumor classification 

As shown in Fig. 2b, the clustering algorithm 
identified two groups of samples, designated A (n = 15, 
including normal breast, NB) and B (n = 20). These groups 
were similar with respect to patient age, menopausal status 
at diagnosis, SBR grading and tumor pathological size. 
However, 72% of tumors in group A were node-positive and 75% 
in group B were node-negative. Moreover, 80% of the tumors 
in group B were estrogen receptor (ER) positive and 50% in 
group A were ER-negative. With a median follow-up of 44 
months after diagnosis, overall survival was different 
between the A and B groups: 5 women died in A (median 
follow-up 58 months) and 1 in B (median follow-up 40 months). 
However, the frequency of metastatic relapse was relatively 
similar in the two groups, with 5 women who relapsed in A and 
6 in B. Because the time between the diagnosis of metastasis 
and last follow-up is too short in B, a longer follow-up is 
needed to determine whether these two groups, defined by 
their expression profiles, really have a different outcome 
with respect to overall survival. 

In group A of 15 samples, three samples 
(normal breast and two tumors) were different from each other 
and from the other 12 samples. The latter constituted two 
subgroups of tumors, A1 (n = 6) and A2 (n = 6), which could 
be further separated by clustering as shown in Fig. 2c. The 
12 tumors had a uniformly high risk of metastatic relapse 
according to conventional prognostic features as shown in 
Table 1. Most of them had received comparable adjuvant 
anthracycline-based chemotherapy after surgery, with more 
women treated in the A1 subgroup. Interestingly, these two 
subgroups, which could not be distinguished with commonly 
used histoclinical features, had a very different clinical 
outcome: there were 4 metastatic relapses and 4 deaths in A1 
(median follow-up: 44 months). In contrast, and despite a 
longer median follow-up (90 months), no metastasis or death 
occurred in A2. This resulted in a significantly better 
metastasis-free survival (p 0.01) and overall survival (p 
0.005) for group A2 than for group A1 tumors. No such 
subgrouping could be done in B. 

Genes responsible for the group A substructure 
were then sought. These are potentially relevant to the 
prognosis and the sensitivity to chemotherapy in these 
tumors. Thirty-two genes out of 188 were identified by 
comparing their median expression level in A1 vs A2. Then, 
the 12 tumors were reclustered using the expression profiles 
of these genes as shown in Fig. 3. The same subgroups A1 and 
A2 were evident and separated by 2 groups of genes: as 
expected, high expression of ERBB2, MYC and EGFR was 
associated with the bad prognosis subgroup A1 (6-8), and that 
of E-cadherin and the proto-oncogene MYB with the good 
prognosis subgroup A2 (9, 10). For most of the other genes, 
these results may stimulate new investigations. 
Differentiation state is a good prognostic factor in breast 
cancer and, accordingly, genes associated with cell 
differentiation, such as GATA3 (11) and CRABP2 (12), had a 
high level of expression in the better outcome group. The 
high expression of Ephrin-A1 mRNA in the bad prognosis 
subgroup suggests a role of this growth factor in breast 
cancer and can be paralleled with its up-regulation during 
melanoma progression (13). 

Differential gene expression between normal 
breast and breast tumors 

To identify genes differentially expressed 
between breast tumors (T) and normal breast (NB), the NB 
value for each gene was compared to its expression level in 
each tumor. When the expression level of a gene in NB was 
undetectable, only qualitative information could be deduced 
and the mRNA was considered as differentially expressed if 
the signal intensity in the tumor was superior to the 
reproducibility threshold (0.002% of mRNA abundance). In the 
other cases, differential expression was defined by an at 
least 2-fold expression difference. Also, the number of 
tumors where it was over- or underexpressed was measured. 
Table 2 shows a list of the top 20 over- and underexpressed 
genes. For these genes, the T/NB ratio is reported, where T 
represents their median expression value in the 34 tumors. 
This ratio ranged from 2.70 (ABCC5) to 17.76 (GATA3) for the 
overexpressed genes, and from 0.00 (desmin) to 0.29 (APC) for 
the underexpressed genes. 
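
The decision rule described above can be sketched as follows. The function names and the example values are hypothetical. The 0.002% reproducibility threshold and the 2-fold criterion are taken from the text, while treating a tumor level exactly equal to twice (or half) the NB level as differential is an assumption based on the wording "at least 2-fold".

```python
from statistics import median

REPRODUCIBILITY_THRESHOLD = 0.002  # % of mRNA abundance
FOLD = 2.0


def call_tumor(tumor_level, normal_level):
    """Classify one tumor against normal breast: 'over', 'under' or 'same'."""
    if normal_level == 0:
        # NB undetectable: only a qualitative call is possible.
        return "over" if tumor_level > REPRODUCIBILITY_THRESHOLD else "same"
    if tumor_level >= FOLD * normal_level:
        return "over"
    if tumor_level <= normal_level / FOLD:
        return "under"
    return "same"


def summarize_gene(tumor_levels, normal_level):
    """Return (tumors overexpressing, tumors underexpressing, T/NB ratio).

    The ratio is None when the gene is undetectable in normal breast.
    """
    calls = [call_tumor(level, normal_level) for level in tumor_levels]
    ratio = median(tumor_levels) / normal_level if normal_level > 0 else None
    return calls.count("over"), calls.count("under"), ratio


# Hypothetical gene measured in four tumors, NB level 0.1%.
n_over, n_under, t_nb = summarize_gene([0.4, 0.5, 0.1, 0.02], 0.1)
```

When the gene is undetectable in NB, no T/NB ratio is returned, which corresponds to the footnote (a) for MYBL2 in Table 2.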



TABLE 2 

Clone ID  Gene/Protein identity                               Gene symbol  Chrom. location  N    T/NB

Overexpressed genes
154343    Granzyme H                                          GZMH         14q11.2          32   9.51
235947    Stromelysin 3                                       STMY3        22q11.2          3X   15.92
207378    MYB-Related Protein B                               MYBL2        20q13.1          31   (a)
153275    Cellular Retinoic Acid Binding Protein 2            CRABP2       1q21.3           29   7.16
129757    GATA-binding protein 3                              GATA3        10p15            28   17.76
120649    T-Lymphocyte surface CD2 antigen                    CD2          1p13.1           28   7.54
109677    CREB Binding Protein                                CREBBP       16p13.3          28   5.08
172152    EGFR-binding protein GRB2                           GRB2         17q24-q25        28   5.00
66969     Transcription factor RELB                           RELB         19               28   3.61
182007    ETS-Related Transcription Factor ELF1               ELF1         13q13            27   3.58
153446    LIM domain protein RIL                              RIL          5q31.1           26   4.03
203394    ETS Variant gene 5 (ETS-related molecule)           ETV5         3q28             25   3.67
160963    Thrombospondin 1                                    THBS1        15q15            25   3.39
188393    POU domain, class 2, transcription Factor 2         POU2F2       19               24   4.02
187822    Integrin, beta 2                                    ITGB2        21q22.3          24   3.01
243907    Nuclear Factor of Activating T cell Subunit p45     NF45         1                24   2.84
158347    EST H27202                                          EST                           23   2.91
230933    EST AW184517                                        EST                           22   2.85
21236S    ATP-Binding Cassette, sub-family C (CFTR/MRP), 5    ABCC5        3q27             22   2.70
149401    Cathepsin D                                         CTSD         11p15.5          21   2.97

Underexpressed genes
153854    Desmin                                              DES          2q35             34   0.00
208717    P55-C-FOS proto-oncogene protein                    FOS          14q24.3          33   0.05
159093    Transcription Factor AP4                            TFAP4        16p13            33   0.11
124340    Tenascin XA                                         TNXA         6p21.3           33   0.14
133738    Prolactin                                           PRL          6p22.2-p21.3     32   0.00
133891    Chorionic Somatomammotropin Hormone 1               CSH1         17q22-q24        32   0.00
151501    Tyrosine Kinase Receptor TEK                        TEK          9p21                  0.00
183030    Activating Transcription Factor 3                   ATF3         1                32   0.07
120916    Phosphodiesterase I                                 PDNP2        8q24.1           32   0.14
155716    EST R72075                                          EST                           31   0.00
208118    Transforming Growth Factor Beta Receptor Type III   TGFBR3       1p33-p32         31   0.14
187547    Diphtheria Toxin Receptor                           DTR          5q23             31   0.17
108490    HIV-1 Rev Binding protein                           HRB          2q36             31   0.20
147002    B-cell CLL/lymphoma 2                               BCL2         18q21.3          31   0.26
182610    Microsomal Glutathione S Transferase 1              MGST1        12p12.3-p12.1    31   0.28
152802    Phospholipase A2 Membrane Associated, group IIA     PLA2G2A      1p35             30   0.03
183087    Interleukin 3 Receptor Alpha chain                  IL3RA        Xp22.3;Yp13.3    30   0.24
108571    Retinoblastoma-Like 2 (p130)                        RBL2         16q12.2          29   0.28
125294    Adenomatous Polyposis Coli Protein                  APC          5q21-q22         29   0.29
151767    FASL Receptor                                       TNFRSF6      10q24.1          28   0.27



List of the genes that show the most frequent 
differential expression between normal breast tissue and 34 
breast carcinomas as measured by cDNA array analysis. N 
indicates the number of tumor samples where the gene is 
dysregulated (fold change > 2) compared to normal breast 
tissue. T/NB represents the ratio: median expression level 
in 34 breast tumors / expression level in normal breast. (a) 
MYBL2 transcript displayed a median expression level of 
0.025% in breast tumors and was undetectable in NB. 

High expression of mucin 1, NM23, ERBB2, FGFR1 
and FGFR2, MYC, stromelysin 3, cathepsin D and downregulation 
of FOS, APC, RBL2, FAS, BCL2 were found, reflecting what is 
known about their biology in cancer. GATA3, which codes for 
a member of the GATA family of zinc finger transcription 
factors, and CRABP2, encoding one of the two cellular 
retinoic acid-binding proteins, showed high expression of 
mRNA, extending previous results on cDNA arrays (4). 

Differential gene expression among various breast 
tumors and correlation with histoclinical prognostic 
parameters 

To search for potential prognostic markers in 
breast cancer, genes with expression levels correlated with 
conventional histoclinical prognostic parameters were looked 
for: age of patients, axillary node status, tumor size, 
histological grade and ER status. No significant correlation 
was found with age, tumor size and histological grade. 
However, the expression profiles of some genes correlated 
with ER status and axillary node involvement. 

To identify genes potentially relevant to the 
hormone -responsive phenotype, the gene expression profiles in 
ER-positive breast cancers (n = 23) vs ER-negative breast 
cancers (n = 11) were compared. Sixteen clones displayed a 
median intensity of 0 in both groups. Twenty-five presented 
a fold change superior to 2. Table 3a displays the top 10 
over- and underexpressed genes. Among them, the most 
differentially expressed was GATA3 with a median intensity 
ratio ER+/ER- of 28.6 and a value for the first quartile of 
ER-positive tumors superior (5-fold) to the value of the 
third quartile of the ER-negative tumors as shown in Fig. 4a. 
The high expression of GATA3 in ER-positive tumors was 
statistically significant using a Mann-Witney test (p 
0.001). All ER-positive tumors and only 18% of ER-negative 
tumors displayed a GATA3 expression level greatly" superior 
(fold change > 3) to the normal breast value. Furthermore 
GATA3 expression was analyzed by Northern blot hybridization 
(Fig. 4b) in a panel of 79 breast cancers (21 ER-negative 
tumors and 58 ER-positive tumors) , including 22 of the tumors 
analyzed with cDNA arrays. It confirmed the array results 
for those 22 tumors as well as the strong correlation between 
ER status and GATA3 RNA expression (Mann-Whitney test, p < 
0.0001). 

TABLE 3A 

Clone ID | Gene/Protein identity                    | Gene symbol | ER+/ER-
129757   | GATA-binding protein 3                   | GATA3       | 28.6
356763   | Granzyme A                               | GZMA        | 5.7
248613   | MYB proto-oncogene                       | MYB         | 3.4
211999   | KIAA1075 protein                         | KIAA1075    | 3.3
235947   | Stromelysin 3                            | STMY3       | 3.1
229839   | Macrophage Stimulating 1                 | MST1        | 2.8
153275   | Cellular Retinoic Acid Binding Protein 2 | CRABP2      | 2.7
301950   | X-box Binding Protein 1                  | XBP1        | 2.7
205314   | Tumor Protein p53                        | TP53        | 2.5
126233   | Insulin-like Growth Factor 2             | IGF2        | 2.4
66322    | CD3G antigen, Gamma                      | CD3G        | 0 0
195022   | Interleukin 2 Receptor Gamma chain       | IL2RG       | 0.0
111461   | SOX4 Protein                             | SOX4        |
151475   | Epidermal Growth Factor Receptor         | EGFR        | 0.5
195022   | Interleukin 2 Receptor Beta chain        | IL2RB       | 0.5
130788   | Topoisomerase (DNA) II beta (180kD)      | TOP2B       | 0.6
323948   | SOX9 Protein                             | SOX9        | 0.6
183641   | S100 calcium-binding protein Beta        | S100B       | 0.6
246620   | EST N53133                               | EST         | 0.6
231424   | Glutathione S Transferase Pi             | GSTP1       | 0.6
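
The ER+/ER- comparison reported for GATA3 can be illustrated with a short sketch. It is not the procedure used in the study: the use of a ratio of medians, the normal approximation for the Mann-Whitney test, the function names and all intensity values below are assumptions made for illustration only.

```python
import math
import statistics


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U, p). U counts the pairs (a, b) with a > b; ties count 0.5.
    No tie or continuity correction is applied, so p is approximate.
    """
    n1, n2 = len(x), len(y)
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


def median_ratio(group_a, group_b):
    """Ratio of group medians (an assumed definition of the ER+/ER- ratio)."""
    return statistics.median(group_a) / statistics.median(group_b)


def fraction_above(values, reference, fold=3.0):
    """Fraction of samples whose level exceeds fold * reference."""
    return sum(v > fold * reference for v in values) / len(values)


# Hypothetical normalized GATA3 intensities (arbitrary units).
er_pos = [12.0, 15.5, 9.8, 20.1, 14.2, 11.0]
er_neg = [0.5, 0.9, 0.4, 1.2, 0.7]
normal_breast = 0.25  # hypothetical normal breast value

u_stat, p_value = mann_whitney_u(er_pos, er_neg)
ratio = median_ratio(er_pos, er_neg)
print(f"U = {u_stat}, p = {p_value:.4f}, median ratio = {ratio:.1f}")
print("ER+ above 3x normal:", fraction_above(er_pos, normal_breast))
print("ER- above 3x normal:", fraction_above(er_neg, normal_breast))
```

With samples as small as in this toy example an exact test would be preferable; the normal approximation is kept here only to keep the sketch dependency-free.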



To search for genes whose expression profile was 
correlated with axillary lymph node status, a strong 
prognostic factor in breast cancer, the group of 
node-negative tumors (n = 19) was compared with the group of 
tumors with massive axillary extension (10 or more positive 
nodes). Furthermore, because survival decreases as the 
number of tumor-involved lymph nodes increases, and because 
the expression measurements were quantitative, a correlation 
was sought between the expression levels of these genes and 
the number of tumor-involved nodes (quantitative variables). 
Table 3b shows a list of the top 10 over- and underexpressed 
genes between these 2 groups. Most of these genes have not 
been previously reported as associated with node status, but 
some of these results are in agreement with literature data. 
The gene encoding the tyrosine kinase receptor ERBB2 was the 
most significantly overexpressed gene in node-positive 
tumors and displayed the highest correlation coefficient 
(r = 0.68; p < 0.0001). 
TABLE 3B 

Clone ID | Gene/Protein identity                   | Gene symbol   | N-/10N+
129757   | GATA-binding protein 3                  | GATA3         | 11.0
160963   | Thrombospondin 1                        | THBS1         | 6.6
151475   | Epidermal Growth Factor Receptor        | EGFR          | 5.4
120916   | Phosphodiesterase I                     | PDNP2         | 4.9
183030   | Activating Transcription Factor 3       | ATF3          | 4.6
211999   | KIAA1075 protein                        | KIAA1075      | 4.5
110480   | Nuclear Factor 1 A-type                 | NF1A          | 4.5
182264   | P-Selectin                              | SELP          | 4.4
356763   | Granzyme A                              | GZMA          | 4.3
214008   | E-cadherin                              | CDH1          | 4.0
147016   | ERBB2 Receptor Protein-Tyrosine Kinase  | ERBB2         | 0.2
179197   | Protein Phosphatase PP2A, 55 kD Subunit | PP2A BR gamma | 0.2
231424   | Glutathione S Transferase Pi            | GSTP1         | 0.4
111461   | SOX4 Protein                            | SOX4          | 0.4
195022   | Interleukin 2 Receptor Beta chain       | IL2RB         | 0.4
220451   | Zinc Finger protein 144                 | ZNF144        | 0.5
125413   | Mucin 1                                 | MUC1          | 0.6
290007   | CD44 antigen, epithelial form           | CD44          | 0.6
108571   | Retinoblastoma-Like 2 (p130)            | RBL2          | 0.7
130788   | Topoisomerase (DNA) II Beta (180kD)     | TOP2B         | 0.7
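
The two analyses described for node status (a comparison of node-negative tumors with tumors having 10 or more involved nodes, and a correlation between expression level and number of involved nodes) can be sketched as follows. The sample values are invented, and the function names and the choice of the Pearson coefficient are assumptions; the text does not state which correlation coefficient was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def split_by_node_status(samples, massive_threshold=10):
    """Split (involved_nodes, expression) pairs into the two compared groups.

    Tumors with 1 to massive_threshold - 1 involved nodes belong to neither
    group, mirroring the comparison described in the text.
    """
    node_negative = [expr for nodes, expr in samples if nodes == 0]
    massive = [expr for nodes, expr in samples if nodes >= massive_threshold]
    return node_negative, massive


# Hypothetical (number of involved nodes, expression level) pairs for one gene.
samples = [(0, 1.0), (0, 1.4), (2, 2.1), (5, 2.9), (11, 4.8), (14, 5.5)]

node_counts = [nodes for nodes, _ in samples]
expression = [expr for _, expr in samples]
r = pearson_r(node_counts, expression)
neg, massive = split_by_node_status(samples)
print(f"r = {r:.2f}; node-negative: {neg}; massive extension: {massive}")
```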



Gene clusters 

Gene clustering from Fig. 2b showed groups of 
genes with correlated expression across samples. When 
different clones represented the same gene, they were 
clustered next to each other (red arrows). Correlation 
coefficients between gene pairs in the 34 tumors were often 
high (1% of the 13,041 gene pairs showed a correlation 
coefficient above 0.95; not shown). An example of 
highly correlated gene expression is that of BCL2 and RBL2. 
Such correlated expression, although it has not been 
described in the literature, probably reflects a common 
mechanism of regulation for these two genes. Furthermore, 
these genes also exhibited significantly correlated 
expression with other genes such as PPP2CA, AKT2, PRKCSH or 
TNFRSF6/FAS. In particular, a striking correlated expression 
between BCL2 and FAS could be observed (r = 0.91; data not 
shown). The exact meaning of this correlation is unknown, 
although it may reflect the necessary balance between 
apoptosis and anti-apoptosis for cell survival. 
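
As a worked illustration of the pairwise analysis above: 13,041 is the number of unordered pairs that can be formed from 162 genes (162 x 161 / 2). The sketch below counts such pairs and lists those whose correlation exceeds a threshold. The gene names and profiles are placeholders, and the Pearson coefficient is an assumption.

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(profiles, threshold=0.95):
    """Return the gene pairs whose profiles correlate above the threshold."""
    pairs = []
    for gene_a, gene_b in itertools.combinations(sorted(profiles), 2):
        if pearson_r(profiles[gene_a], profiles[gene_b]) > threshold:
            pairs.append((gene_a, gene_b))
    return pairs


# Number of unordered pairs among 162 genes.
n_pairs = math.comb(162, 2)

# Placeholder expression profiles across four samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.1],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}
print(n_pairs, highly_correlated_pairs(profiles))
```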

Although in human cancer the proportion of 
changes that is reflected at the RNA level is not known, 
monitoring gene expression patterns appears to be a very 
promising way of increasing knowledge of the disease. 
Several different types of cancer have been investigated 
using cDNA arrays: cervical (14), hepatocellular (15), 
ovarian (16), colon (17) and renal carcinomas (18), 
glioblastomas (19), melanomas (20) (21), rhabdomyosarcomas 
(22), acute leukemias (23) and lymphomas (24). In breast 
cancer, pioneering studies have yielded the first expression 
patterns (4, 25-31). They have in particular addressed the 
important issue of molecular differences between 
hormone-responsive and non-responsive breast tumors. Thus, 
Yang et al. (28) and Hoch et al. (25) compared expression 
profiles of breast carcinoma cell lines known to represent 
these two categories and identified a few genes with 
differential expression. One of these genes was GATA3. In 
these studies, cell lines were mostly used; tumor samples 
were rarely tested, and generally in small numbers. The 
first study analyzing the expression profiles of a large 
series of breast cancers was published recently (32), but no 
correlation with clinical outcome was mentioned. 
Several interesting points can be made based on 
the present experimentation. First, the differences in 
expression patterns among the tumors provided molecular 
transcriptional evidence of the histoclinical heterogeneity 
of breast cancer. This diversity was multifactorial, linked 
to many different genes, highlighting the interest of 
high-throughput analysis in this context. It was possible, 
with a hierarchical clustering program integrating the 
expression profiles, to separate normal breast tissue from 
most tumors and, moreover, to identify two different groups 
of tumors. Most importantly, two different subgroups of 
tumors with very distinct clinical outcomes, which could not 
be predicted with classical prognostic factors, were 
identified by clustering. Indeed, all these tumors had a 
theoretically bad prognosis as evaluated by current 
histoclinical tools. At present, all these patients would be 
treated with adjuvant chemotherapy, but physicians would 
have no means of identifying which patients will benefit 
from this treatment and which will not. 

Gene expression profiles were able to make this 
discrimination. Such predictive tools have important 
therapeutic implications. Patients with features of poor 
prognosis are candidates for treatments other than standard 
chemotherapy, avoiding the loss of time and the toxicities 
related to first-line chemotherapy. These results suggest 
that the histoclinical category of poor prognosis breast 
cancer, currently treated with adjuvant anthracycline-based 
chemotherapy, groups together at least two molecularly 
distinct subgroups of tumors with different outcomes, which 
would require distinct chemotherapy regimens. Expression 
profiles could thus provide a new and more accurate way of 
classifying breast tumors of poor prognosis and managing 
patients. 
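
The hierarchical clustering of tumors by expression profile referred to above can be illustrated with a minimal agglomerative sketch. The clustering program, distance measure and linkage rule actually used are not specified in this passage; average linkage with a correlation distance (1 - Pearson r) is assumed here, and the tumor names and profiles are invented.

```python
import math


def correlation_distance(x, y):
    """1 - Pearson r: near 0 for similar profiles, near 2 for opposite ones."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average distance over all cross-cluster pairs of samples.
        total = sum(
            correlation_distance(profiles[a], profiles[b]) for a in c1 for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical expression profiles (four genes) for four tumors.
tumors = {
    "T1": [1.0, 2.0, 3.0, 4.0],
    "T2": [1.1, 2.0, 3.2, 3.9],
    "T3": [4.0, 3.0, 2.0, 1.0],
    "T4": [3.9, 3.1, 2.2, 0.8],
}
groups = average_linkage(tumors, n_clusters=2)
print(groups)
```

Cutting the tree at two clusters mirrors the two tumor subgroups discussed in the text; in practice the number of groups would be read from the dendrogram rather than fixed in advance.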

Similarly, despite molecular heterogeneity, 
significant correlations between the expression level of 
genes (GATA3, ERBB2) and histological tumor parameters were 
identified. ER-positivity in breast cancer has been 
correlated with tumor differentiation, low proliferation 
rate, favorable prognosis and response to hormonal therapy. 
The relation between hormone sensitivity of breast cancer and 
ER status is not perfect, and it is possible that some genes 
related to ER expression are more important than ER in 
characterizing the hormone-sensitive phenotype. These genes 
could serve as predictive factors to guide therapy. 

GATA3 mRNA expression was highly correlated with 
ER status. GATA3, which is not estrogen-regulated (25), is a 
transcription factor that could regulate the expression of 
genes involved in the ER-positive phenotype. Among the other 
genes that were found to be associated with ER status during 
the experimental work leading to the present invention, some, 
such as MYB (10), stromelysin 3 (33), and CRABP2 (34), have 
previously been reported to be expressed at high levels in 
ER-positive breast tumors. The higher levels of TP53 mRNA in 
the ER-positive tumors studied were surprising, although in 
agreement with a recent study (27). Most studies concerning 
TP53 expression analyzed the protein level rather than the 
mRNA level, and TP53 protein levels are classically 
negatively correlated with ER status (35). The high 
expression of CRABP2 could be related to the better 
differentiated status of the ER-positive tumors. The low 
expression of the three immunity-related genes IL2RB, IL2RG 
and CD3G may be related to the low lymphoid infiltration in 
these well differentiated tumors. High ERBB2 expression in 
breast cancer has been associated with a poor prognosis and 
some resistance to hormonal therapy and chemotherapy (36). 
ERBB2 is involved in the regulation of cellular 
differentiation, adhesion, and motility. The 
motility-enhancing activity of ERBB2 (37) could be 
responsible for the increased metastatic potential and the 
unfavorable prognosis of the breast tumors that overexpress 
ERBB2. The low expression of E-cadherin and thrombospondin 1 
in node-positive tumors is consistent with their putative 
roles in different steps of metastatic spread: E-cadherin is 
an epithelial cell adhesion molecule whose disturbance is a 
prerequisite for the release of invasive cells in carcinomas 
(38), and thrombospondin 1 inhibits angiogenesis (39). 
Similarly, the high expression of the surface antigen 
Mucin 1 in node-positive tumors (40) can reduce cell-cell 
interactions, facilitating cell detachment and metastasis. 
CD44, encoding a transmembrane glycoprotein involved in cell 
adhesion and lymph node homing (41), was expressed at high 
levels in node-positive tumors, as was GSTP1 
(Glutathione-S-Transferase Pi), recently reported to be 
associated with increased tumor size (27). 

Second, there were a number of genes with highly 
correlated expression patterns. Gene correlations have 
already been reported with larger series of genes, 
essentially under dynamic experimental conditions (42) and 
recently in steady states (17). Here, correlations were 
based on expression profiles of a relatively small but 
selected series of genes and in steady states represented by 
different breast tumors. Gene correlations are potentially 
useful tools for cancer research in two ways: i) they can 
provide information about the general regulation circuitry of 
a cancerous cell, allowing the identification of regulatory 
elements controlling expression networks; ii) they offer the 
possibility of reducing the complexity of the system analyzed 
by replacing, for example, the intensities of a large number 
of genes present in a gene cluster by their respective mean 
intensities. 
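
The second use of gene correlations, replacing the intensities of the genes in a cluster by their mean, can be sketched as below. The function name, the choice of cluster members and the intensity values are illustrative assumptions, not data from the study.

```python
def cluster_mean_profile(intensities, cluster_genes):
    """Average, sample by sample, the intensities of the genes in a cluster.

    intensities maps each gene symbol to a list of per-sample values;
    all lists must cover the same samples in the same order.
    """
    rows = [intensities[gene] for gene in cluster_genes]
    return [sum(column) / len(column) for column in zip(*rows)]


# Hypothetical intensities of three correlated genes in three tumors.
intensities = {
    "BCL2": [1.0, 2.0, 3.0],
    "RBL2": [3.0, 4.0, 5.0],
    "FAS": [2.0, 3.0, 4.0],
}
summary = cluster_mean_profile(intensities, ["BCL2", "RBL2", "FAS"])
print(summary)
```

A cluster of many genes is thus reduced to a single profile of one value per sample, which can then be used in place of the individual genes in downstream comparisons.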

Finally, these results highlight the great 
potential of cDNA arrays in cancer research. The gene 
expression profiles confirmed the heterogeneity of breast 
cancer and, most importantly, made it possible to identify, 
among a series of poor prognosis breast tumors, two subtypes 
of the disease not yet recognized with the usual 
histoclinical parameters but with a different clinical 
outcome after adjuvant chemotherapy. Furthermore, the 
present invention allows the detection of genes whose 
expression is correlated with classical prognostic factors. 

Table 4 displays a library of polynucleotides SEQ 
ID NO: 1 to SEQ ID NO: 468 corresponding to a population of 
polynucleotide sequences underexpressed or overexpressed in 
cells derived from tumors, more particularly breast tumors, 
and their respective complements. 

TABLE 4 

Gene symbol | Set No | Name | Seq3' (SEQ ID NO) | Seq5' (SEQ ID NO) | Ref (SEQ ID NO)
HRB | 1 | hiv-1 rev binding protein | 1 | - | 2
GATA1 | 2 | gata-binding protein 1 (globin transcription factor 1) | - | 3 | 4
TLK2 | 3 | tousled-like kinase 2 | - | 5 | 6
EST T81919 | 4 | ests, weakly similar to alu7_human alu subfamily sq sequence contamination warning entry [h. sapiens] | 7 | 8 | -
CCND1 | 5 | cyclin d1 (prad1: parathyroid adenomatosis 1) | 9 | - | 10
STAT1 | 6 | signal transducer and activator of transcription 1, 91kd | - | 11 | 12
FGFR2 | 7 | fibroblast growth factor receptor 2 (bacteria-expressed kinase, keratinocyte growth factor receptor, craniofacial dysostosis 1, crouzon syndrome, pfeiffer syndrome, jackson-weiss syndrome) | 13 | 14 | 15
EST T89980 | 8 | ests | 16 | - | -
PPP3CC | 9 | protein phosphatase 3 (formerly 2b), catalytic subunit, gamma isoform (calcineurin a gamma) | 17 | 18 | 19
EST T90726 | 10 | ests | 20 | 21 | -
SOX4 | 11 | sry (sex determining region y)-box 4 | 22 | 23 | 24
RNF5 | 12 | ring finger protein 5 | - | 25 | 26
AXL | 13 | axl receptor tyrosine kinase | 27 | 28 | 29
CTSB | 14 | cathepsin b | - | 30 | 31
PPP4C | 15 | protein phosphatase 4 (formerly x), catalytic subunit | 32 | 33 | 34
EST T79867 | 16 | ests | 35 | - | -
FGFR4 | 17 | fibroblast growth factor receptor 4 | 36 | 37 | 38
ENPP2 | 18 | ectonucleotide pyrophosphatase/phosphodiesterase 2 (autotaxin) | 39 | 40 | 41
RELA | 19 | v-rel avian reticuloendotheliosis viral oncogene homolog a (nuclear factor of kappa light polypeptide gene enhancer in b-cells 3 (p65)) | 42 | - | 43
ITK | 20 | il2-inducible t-cell kinase | - | 44 | 45
TNXB | 21 | tenascin xb | - | 46 | 47
CSF1 | 22 | colony stimulating factor 1 (macrophage) | 48 | 49 | 50
VIL2 | 23 | villin 2 (ezrin) | 51 | 52 | 53
APC | 24 | adenomatosis polyposis coli | 54 | 55 | 56
MUC1 | 25 | mucin 1, transmembrane | - | 57 | 58
IGF2 | 26 | insulin-like growth factor 2 (somatomedin a) | 59 | 60 | 61
EMR1 | 27 | egf-like module containing, mucin-like, hormone receptor-like sequence 1 | 62 | 63 | 64
KIAA0427 | 28 | kiaa0427 gene product | 65 | 66 | 67
SYK | 29 | spleen tyrosine kinase | 68 | 69 | 70
IL7R | 30 | interleukin 7 receptor | - | 71 | 72
MYC | 31 | v-myc avian myelocytomatosis viral oncogene homolog | 73 | 74 | 75
GATA3 | 32 | gata-binding protein 3 | 76 | 77 | 78
GRB7 | 33 | growth factor receptor-bound protein 7 | 79 | 80 | 81
TOP2B | 34 | topoisomerase (dna) ii beta (180kd) | - | 82 | 83
CASP4 | 35 | caspase 4, apoptosis-related cysteine protease | 84 | - | 85
TIMP2 | 36 | tissue inhibitor of metalloproteinase 2 | - | 86 | 87
DDT | 37 | d-dopachrome tautomerase | 88 | 89 | 90
PRL | 38 | prolactin | 91 | 92 | 93
PRLR | 39 | prolactin receptor | 94 | 95 | 96
IL2RB | 40 | interleukin 2 receptor, beta | 97 | 98 | 99
GATA3 | 41 | gata-binding protein 3 | 100 | 101 | 78
PGF | 42 | placental growth factor, vascular endothelial growth factor-related protein | - | 102 | 103
UBE3A | 43 | ubiquitin protein ligase e3a (human papilloma virus e6-associated protein, angelman syndrome) | - | 104 | 105
TC21 | 44 | oncogene tc21 | 106 | 107 | 108
TIE | 45 | tyrosine kinase with immunoglobulin and epidermal growth factor homology domains | - | 109 | 110
AMFR | 46 | autocrine motility factor receptor | 111 | 112 | 113
EST R81127 | 47 | homo sapiens mrna; cdna dkfzp434c136 (from clone dkfzp434c136) | 114 | - | -
BCL2 | 48 | b-cell cll/lymphoma 2 | 115 | 116 | 117
ERBB2 | 49 | v-erb-b2 avian erythroblastic leukemia viral oncogene homolog 2 (neuro/glioblastoma derived oncogene homolog) | - | 118 | 119
MDM2 | 50 | mouse double minute 2, human homolog of; p53-binding protein | - | 120 | 121
GATA3 | 51 | gata-binding protein 3 | 122 | - | 78
HIP-55 | 52 | src homology 3 domain-containing protein hip-55 | 123 | 124 | 125
CTSD | 53 | cathepsin d (lysosomal aspartyl protease) | 126 | 127 | 128
IGF1R | 54 | insulin-like growth factor 1 receptor | - | 129 | 130
INSR | 55 | insulin receptor | - | 131 | 132
FOXO1A | 56 | forkhead box o1a (rhabdomyosarcoma) | - | 133 | 134
EGFR | 57 | epidermal growth factor receptor (avian erythroblastic leukemia viral (v-erb-b) oncogene homolog) | 135 | 136 | 137
TEK | 58 | tek tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal) | 138 | 139 | 140
TNFRSF6 | 59 | tumor necrosis factor receptor superfamily, member 6 | 141 | 142 | 143
CDKN1A | 60 | cyclin-dependent kinase inhibitor 1a (p21, cip1) | 144 | 145 | 146
PLA2G2A | 61 | phospholipase a2, group iia (platelets, synovial fluid) | 147 | 148 | 149
GAPD | 62 | glyceraldehyde-3-phosphate dehydrogenase | 150 | 151 | 152
JUNB | 63 | jun b proto-oncogene | 153 | 154 | 155
CRABP2 | 64 | cellular retinoic acid-binding protein 2 | 156 | 157 | 158
ACVRL1 | 65 | activin a receptor type ii-like 1 | 159 | 160 | 161
RIL | 66 | lim domain protein | - | 162 | 163
SHC1 | 67 | shc (src homology 2 domain-containing) transforming protein 1 | - | 164 | 165
GAPD | 68 | glyceraldehyde-3-phosphate dehydrogenase | 166 | 167 | 152
DES | 69 | desmin | 168 | 169 | 170
CSNK2B | 70 | casein kinase 2, beta polypeptide | - | 171 | 172
GLG1 | 71 | golgi apparatus protein 1 | 173 | 174 | 175
EDNRB | 72 | endothelin receptor type b | - | 176 | 177
GZMB | 73 | granzyme b (granzyme 2, cytotoxic t-lymphocyte-associated serine esterase 1) | 178 | - | 179
FGFR1 | 74 | fibroblast growth factor receptor 1 (fms-related tyrosine kinase 2, pfeiffer syndrome) | 180 | 181 | 182
PPP2CA | 75 | protein phosphatase 2 (formerly 2a), catalytic subunit, alpha isoform | - | 183 | 184
EST R55460 | 76 | homo sapiens, clone image:4054156, mrna, partial cds | - | 185 | -
IGKC | 77 | immunoglobulin kappa constant | 186 | - | -
MC1R | 78 | melanocortin 1 receptor (alpha melanocyte stimulating hormone receptor) | - | 187 | 188
NRG1 | 79 | neuregulin 1 | 189 | 190 | 191
CNTFR | 80 | ciliary neurotrophic factor receptor | - | 192 | 193
ANG | 81 | angiogenin, ribonuclease, rnase a family, 5 | - | 194 | 195
ENG | 82 | endoglin (osler-rendu-weber syndrome 1) | 196 | 197 | 198
EGF | 83 | epidermal growth factor (beta-urogastrone) | 199 | - | 200
HRMT1L1 | 84 | hmt1 (hnrnp methyltransferase, s. cerevisiae)-like 1 | 201 | 202 | 203
ETV4 | 85 | ets variant gene 4 (e1a enhancer-binding protein, e1af) | 204 | 205 | -
ANXA11 | 86 | annexin a11 | - | 206 | 207
PDGFRB | 87 | platelet-derived growth factor receptor, beta polypeptide | - | 208 | 209
WBSCR14 | 88 | williams-beuren syndrome chromosome region 14 | - | 210 | 211
CD74 | 89 | cd74 antigen (invariant polypeptide of major histocompatibility complex, class ii antigen-associated) | - | 212 | 213
ANXA7 | 90 | annexin a7 | - | 214 | 215
THBS1 | 91 | thrombospondin 1 | 216 | - | 217
PTPN2 | 92 | protein tyrosine phosphatase, non-receptor type 2 | 218 | 219 | 220
EPHA2 | 93 | epha2 | 221 | - | 222
TIMP1 | 94 | tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor) | 223 | 224 | 225
EFNA1 | 95 | ephrin-a1 | - | 226 | 227
EDNRA | 96 | endothelin receptor type a | 228 | - | 229
GRB2 | 97 | growth factor receptor-bound protein 2 | 230 | 231 | 232
JUND | 98 | jun d proto-oncogene | 233 | - | 234
SMARCA2 | 99 | swi/snf related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 | 235 | 236 | 237
PPP2R2C | 100 | protein phosphatase 2 (formerly 2a), regulatory subunit b (pr 52), gamma isoform | 238 | 239 | -
THBS3 | 101 | thrombospondin 3 | 240 | - | 241
ACTG1 | 102 | actin, gamma 1 | 242 | 243 | 244
ITGA6 | 103 | integrin, alpha 6 | 245 | 246 | 247
RAD9 | 104 | rad9 (s. pombe) homolog | 248 | - | 249
ATF3 | 105 | activating transcription factor 3 | 250 | 251 | 252
AKT2 | 106 | v-akt murine thymoma viral oncogene homolog 2 | 253 | - | 254
S100B | 107 | s100 calcium-binding protein, beta (neural) | - | 255 | 256
ABCB1 | 108 | atp-binding cassette, sub-family b (mdr/tap), member 1 | 257 | - | 258
SELE | 109 | selectin e (endothelial adhesion molecule 1) | 259 | 260 | 261
EGF | 110 | epidermal growth factor (beta-urogastrone) | 262 | - | 200
PRKCSH | 111 | protein kinase c substrate 80k-h | - | 263 | 264
DTR | 112 | diphtheria toxin receptor (heparin-binding epidermal growth factor-like growth factor) | - | 265 | 266
ITGB2 | 113 | integrin, beta 2 (antigen cd18 (p95), lymphocyte function-associated antigen 1; macrophage antigen 1 (mac-1) beta subunit) | - | 267 | 268
NEO1 | 114 | neogenin (chicken) homolog 1 | - | 269 | 270
POU2F2 | 115 | pou domain, class 2, transcription factor 2 | 271 | - | 272
BIRC4 | 116 | baculoviral iap repeat-containing 4 | 273 | - | 274
DAP3 | 117 | death associated protein 3 | 275 | - | 276
GNRH1 | 118 | gonadotropin-releasing hormone 1 (leutinizing-releasing hormone) | - | 277 | 278
IL2RG | 119 | interleukin 2 receptor, gamma (severe combined immunodeficiency) | 279 | 280 | 281
DAP3 | 120 | death associated protein 3 | 282 | 283 | 276
PTK2 | 121 | ptk2 protein tyrosine kinase 2 | - | 284 | 285
CDK4 | 122 | cyclin-dependent kinase 4 | 286 | 287 | 288
BTF3 | 123 | basic transcription factor 3 | 289 | - | 290
CSF1R | 124 | colony stimulating factor 1 receptor, formerly mcdonough feline sarcoma viral (v-fms) oncogene homolog | 291 | - | 292
FLI1 | 125 | friend leukemia virus integration 1 | 293 | 294 | 295
EST R97218 | 126 | ests, highly similar to tvhume hepatocyte growth factor receptor precursor [h. sapiens] | 296 | 297 | -
ETV5 | 127 | ets variant gene 5 (ets-related molecule) | 298 | 299 | 300
CDK4 | 128 | cyclin-dependent kinase 4 | 301 | 302 | 288
YES1 | 129 | v-yes-1 yamaguchi sarcoma viral oncogene homolog 1 | 303 | - | 304
IFI75 | 130 | interferon-induced protein 75, 52kd | 305 | 306 | 307
MYBL2 | 131 | v-myb avian myeloblastosis viral oncogene homolog-like 2 | 308 | 309 | 310
TGFBR3 | 132 | transforming growth factor, beta receptor iii (betaglycan, 300kd) | 311 | 312 | 313
PRDX2 | 133 | peroxiredoxin 2 | 314 | 315 | 316
FOS | 134 | v-fos fbj murine osteosarcoma viral oncogene homolog | - | 317 | 318
RBBP7 | 135 | retinoblastoma-binding protein 7 | 319 | 320 | 321
KIAA1075 | 136 | kiaa1075 protein | 322 | 323 | -
ABCC5 | 137 | atp-binding cassette, sub-family c (cftr/mrp), member 5 | - | 324 | 325
CDH1 | 138 | cadherin 1, type 1, e-cadherin (epithelial) | 326 | 327 | 328
ZNF144 | 139 | zinc finger protein 144 (mel-18) | - | 329 | 330
MST1 | 140 | macrophage stimulating 1 (hepatocyte growth factor-like) | 331 | 332 | 333
GSTP1 | 141 | glutathione s-transferase pi | 334 | 335 | 336
BCL2 | 142 | b-cell cll/lymphoma 2 | 337 | 338 | 117
PCNA | 143 | proliferating cell nuclear antigen | 339 | 340 | 341
BS69 | 144 | adenovirus 5 e1a binding protein | 342 | 343 | 344
MMP11 | 145 | matrix metalloproteinase 11 (stromelysin 3) | 345 | - | 346
MGC13071 | 146 | hypothetical protein mgc13071 | 347 | 348 | 349
ILF2 | 147 | interleukin enhancer binding factor 2, 45kd | - | 350 | 351
FLJ11307 | 148 | hypothetical protein flj11307 | 352 | - | 353
MYB | 149 | v-myb avian myeloblastosis viral oncogene homolog | - | 354 | 355
ZNF9 | 150 | zinc finger protein 9 (a cellular retroviral nucleic acid binding protein) | 356 | - | 357
CREM | 151 | camp responsive element modulator | 358 | 359 | 360
CTSB | 152 | cathepsin b | 361 | - | 31
MLANA | 153 | melan-a | 362 | 363 | 364
APR-1 | 154 | apr-1 protein | 365 | 366 | 367
ETV5 | 155 | ets variant gene 5 (ets-related molecule) | 368 | 369 | 300
CD69 | 156 | cd69 antigen (p60, early t-cell activation antigen) | - | 370 | 371
TC21 | 157 | oncogene tc21 | 372 | 373 | 108
CD44 | 158 | cd44 antigen (homing function and indian blood group system) | 374 | 375 | 376
CDKN3 | 159 | cyclin-dependent kinase inhibitor 3 (cdk2-associated dual specificity phosphatase) | 377 | 378 | 379
MXI1 | 160 | max-interacting protein 1 | - | 380 | 381
HOXA5 | 161 | homeo box a5 | 382 | 383 | 384
XBP1 | 162 | x-box binding protein 1 | 385 | 386 | 387
TNFAIP3 | 163 | tumor necrosis factor, alpha-induced protein 3 | 388 | 389 | 390
SRF | 164 | serum response factor (c-fos serum response element-binding transcription factor) | 391 | 392 | 393
SOX9 | 165 | sry (sex determining region y)-box 9 (campomelic dysplasia, autosomal sex-reversal) | 394 | - | 395
CDH15 | 166 | cadherin 15, m-cadherin (myotubule) | 396 | 397 | 398
BCL2 | 167 | b-cell cll/lymphoma 2 | 399 | 400 | 117
EST W73386 | 168 | ests | 401 | - | -
GZMA | 169 | granzyme a (granzyme 1, cytotoxic t-lymphocyte-associated serine esterase 3) | 402 | - | 403
FOS | 170 | v-fos fbj murine osteosarcoma viral oncogene homolog | 404 | 405 | 318
ILF1 | 171 | interleukin enhancer binding factor 1 | 406 | 407 | 408
ARHGDI2 | 172 | rho gdp dissociation inhibitor (gdi) alpha | 409 | 410 | 411
C4A | 173 | complement component 4a | 412 | - | 413
CD3G | 174 | cd3g antigen, gamma polypeptide (tit3 complex) | 414 | 415 | 416
RELB 


175 


v-rel avian reticuloendotheliosis 
viral oncogene homolog b (nuclear 
factor of kappa light polypeptide 
gene enhancer in b-cells 3) 


SEQ ID 
No:417 


SEQ ID 
No: 418 


SEQ ID 
No: 419 


ESR1 


176 


estrogen receptor 1 


SEQ ID 
No:420 


SEQ ID 
No: 421 


SEQ ID 
No: 422 


PBX1 


177 


pre-b-cell leukemia transcription 
factor 1 


SEQ ID 
No:423 


SEQ ID 
No: 424 


SEQ ID 
No:425 


GLI3 


178 


gli-kruppel family member gli3 
(greig cephalopolysyndactyly 
syndrome)


SEQ ID 
No:426 


SEQ ID 
No: 427 


SEQ ID 
No:428 


ILF1 


179 


interleukin enhancer binding factor 1


SEQ ID 
No:429 




SEQ ID 
No: 408 


EST 
T80406 


180 


Similar to SP:S36648 S36648
RB2/P130 PROTEIN


SEQ ID 
No : 430 






EST 
T95640 


181 


similar to gb:M16336 T-CELL SURFACE 
ANTIGEN CD2 


SEQ ID 
No: 431 






EST 
R28523 


182 


similar to placental lactogen 
(CSH1) 


SEQ ID 
No: 432 






ESTs
H21879 &
H21880


183 


Homo sapiens plasminogen activator 
(PLAT) 


SEQ ID 
No: 433 


SEQ ID 
No: 434 




ESTs
H24628 &
H24592


184 


Homo sapiens aminoacylase 1 (ACY1)


SEQ ID 
No:435 


SEQ ID 
No:436 




EST 
H28056 


185 


Homo sapiens E74-like factor 1 (ets
domain transcription factor) (ELF1) 


SEQ ID 
No:437 






ESTs
H30141 &
H27466


186 


Homo sapiens selectin P 


SEQ ID 
No:438 


SEQ ID 
No:439 




ESTs
H42957 &
H42888


187 


Human interleukin 3 receptor
(hIL-3Ra)


SEQ ID 
NO:440 


SEQ ID 
No:441 




EST 
H57912 


188 


Human tumor protein p53 (Li- 
Fraumeni syndrome) (TP53) 


SEQ ID 
No: 442 


SEQ ID 
No:443 
















ERBB2 


189 


v-erb-b2 avian erythroblastic 
leukemia viral oncogene homolog 2 
(neuro/glioblastoma derived 
oncogene homolog) (ERBB2) 


SEQ ID 
No: 444 






ZNF144


190 


zinc finger protein 144 (Mel-18)
(ZNF144) 


SEQ ID 
No:445


MARK3 


191


MAP/microtubule affinity-regulating
kinase 3 (MARK3) 


SEQ ID 
No:446


SEQ ID 
NO: 447 




EST 
N68536 


192 


EST N68536 MAX-interacting protein
1 (MXI1) 


SEQ ID 
No: 448 






EST 
R81126 


193 


EST R81126 lymphotoxin beta 
receptor (LTBR) 




SEQ ID 
NO:449 




POU2F2 


194 


POU domain, class 2, transcription
factor 2 (POU2F2)




SEQ ID 
No:450 




CASP1


195 


caspase 4, apoptosis-related 
cysteine protease (CASP4) (ex 
CASP1) 




SEQ ID 
No: 451 




HRB 


196 


syndecan 1 (SDC1) (ex HRB) 




SEQ ID 
No: 452 




ITGB2 


197 


integrin, beta 2 (antigen CD18 
(p95) , lymphocyte function- 
associated antigen 1; macrophage 
antigen 1 (mac-1) beta subunit) 
(ITGB2) 


SEQ ID 
No: 453 






MGST1 


198 


protein phosphatase 1, catalytic 
subunit, alpha isoform (PPP1CA) (ex
MGST1)




SEQ ID 
No: 454 






199 


protein phosphatase 2 (formerly
2A), catalytic subunit, alpha
isoform (PPP2CA)


SEQ ID 
No: 455 






S100A11


200 


S100 calcium-binding protein A11
(calgizzarin) (S100A11) 




SEQ ID 
No:456 




GZMA 


201 


granzyme A (granzyme 1, cytotoxic 
T-lymphocyte-associated serine 
esterase 3) (GZMA) 




SEQ ID 
No:457 




EDN1 


202 


endothelin 1 (EDN1) 


SEQ ID 
No:458 






PTPN6 


203 


protein tyrosine phosphatase, non- 
receptor type 6 (PTPN6) 


SEQ ID 
No: 459 






TFAP4 


204 


transcription factor AP-4 
(activating enhancer binding 
protein 4) (TFAP4) 


SEQ ID 
No: 460 






CCND2 


205 


cyclin D2 (CCND2) 


SEQ ID 
No:461 






JUP 


206 


junction plakoglobin (JUP) 


SEQ ID 
No:462






GADD45A


207


growth arrest and DNA-damage- 
inducible, alpha (GADD45A) 


SEQ ID 
No:463 






nm23 


208 


non-metastatic cells 1, protein 
(NM23A) expressed in (NME1) 


SEQ ID 
NO: 464 






BBC1 


209 


ribosomal protein L13 (RPL13) (ex
BBC1)


SEQ ID
No:465






VEGFB


210


vascular endothelial growth factor
B (VEGFB)


SEQ ID 
No:466






LAMR1 


211 


laminin receptor 1 (67kD, ribosomal
protein SA) (LAMR1) 


SEQ ID 
No : 467 






CSH1 


212 


Chorionic somatomammotropin hormone 
1 (placental lactogen) = LACTOGEN 
Precursor 




SEQ ID 
No: 468 





Tables 5A and 5B below show two
subpopulations, corresponding to the five most overexpressed
and the five most underexpressed polynucleotide sequences,
that are of particular interest for distinguishing a healthy
person from a cancer patient.



TABLE 5A 

overexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3 ' 


Seq5' 


Ref 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID 
NO: 78 


GZMB 


73 


granzyme b (granzyme 2, cytotoxic 
t-lymphocyte-associated serine
esterase 1) 


SEQ ID 
No: 178 




SEQ ID 
No: 179 


MYBL2 


131 


v-myb avian myeloblastosis viral 
oncogene homolog-like 2 


SEQ ID 
No:308


SEQ ID
No:309


SEQ ID 
No:310 


MMP11 


145 


matrix metalloproteinase 11
(stromelysin 3) 


SEQ ID 
No:345 




SEQ ID 
NO: 346 


EST 
T95640 


181 


similar to gb:M16336 T-CELL SURFACE 
ANTIGEN CD2


SEQ ID 
No:431


TABLE 5B

underexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3 ' 


Seq5' 


Ref 


PRL 


38 


prolactin 


SEQ ID 
No: 91 


SEQ ID 
No: 92 


SEQ ID 
No: 93 


TEK 


58 


tek tyrosine kinase, endothelial 
(venous malformations, multiple 
cutaneous and mucosal) 


SEQ ID 
No: 138 


SEQ ID 
No: 139 


SEQ ID 
No:140 


PLA2G2A 


61 


phospholipase a2, group iia
(platelets, synovial fluid) 


SEQ ID 
No: 147 


SEQ ID 
No: 148 


SEQ ID 
No: 149 


DES 


69 


desmin


SEQ ID 
No:168


SEQ ID 
NO:169 


SEQ ID 
No: 170 


EST R28523 


182 


similar to placental lactogen (CSH1) 


SEQ ID 
No:432 







Table 6 below relates to subpopulations of
polynucleotide sequences of interest for detecting
hormone-sensitive tumors, allowing ER+ and ER- samples to be
distinguished.

TABLE 6 



Gene 
symbol 


SET 
No 


Name 


Seq3'


Seq5' 


Ref 


SOX4 


11 


sry (sex determining region y)-box 4


SEQ ID 
No: 22 


SEQ ID 
No: 23 


SEQ ID 
No: 24 


IGF2 


26 


insulin-like growth factor 2 
(somatomedin a) 


SEQ ID 
No: 59 


SEQ ID 
No: 60 


SEQ ID 
No: 61 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID 
No: 78 


TOP2B 


34 


topoisomerase (dna) ii beta (180kd) 




SEQ ID 
No: 82 


SEQ ID 
No: 83 


IL2RB 


40 


interleukin 2 receptor, beta 


SEQ ID 
No: 97 


SEQ ID 
No: 98 


SEQ ID 
No: 99 


EGFR 


57 


epidermal growth factor receptor 
(avian erythroblastic leukemia viral 
(v-erb-b) oncogene homolog) 


SEQ ID 
No: 135 


SEQ ID 
No: 136 


SEQ ID 
No: 137 


CRABP2 


64 


cellular retinoic acid-binding protein
2


SEQ ID
No:156


SEQ ID 
No: 157 


SEQ ID 
No:158


S100B 


107 


s100 calcium-binding protein, beta
(neural) 




SEQ ID 
No: 255 


SEQ ID 
No: 256 


IL2RG 


119 


interleukin 2 receptor, gamma (severe 
combined immunodeficiency) 


SEQ ID
No:279


SEQ ID 
No:280 


SEQ ID 
No:281 


KIAA1075


136


kiaa1075 protein


SEQ ID 
No: 322 


SEQ ID 
No: 323 




MST1 


140 


macrophage stimulating 1 (hepatocyte 
growth factor-like)


SEQ ID 
No:331


SEQ ID 
No:332 


SEQ ID 
No:333 


GSTP1 


141 


glutathione s-transferase pi


SEQ ID
No:334


SEQ ID 
No:335 


SEQ ID 
No:336 


MMP11 


145 


matrix metalloproteinase 11
(stromelysin 3)


SEQ ID
No:345




SEQ ID 
NO:346 


FLJ11307


148


hypothetical protein flj11307


SEQ ID 
No:352 




SEQ ID 
NO:353 


MYB 


149 


v-myb avian myeloblastosis viral 
oncogene homolog 




SEQ ID 
No: 354 


SEQ ID 
NO:355 


XBP1 


162 


x-box binding protein 1 


SEQ ID 
No:385


SEQ ID
No:386


SEQ ID
No:387


SOX9


165 


sry (sex determining region y)-box 9
(campomelic dysplasia, autosomal
sex-reversal)


SEQ ID 
No: 394 




SEQ ID 
No:395 


GZMA 


169 


granzyme a (granzyme 1, cytotoxic t- 
lymphocyte-associated serine esterase 
3) 


SEQ ID 
No: 402 




SEQ ID 
No: 403 


CD3G 


174 


cd3g antigen, gamma polypeptide (tit3 
complex) 


SEQ ID 
No: 414 


SEQ ID 
No:415 


SEQ ID 
No:416 


EST 
H57912 


188 


Human tumor protein p53 (Li-Fraumeni 
syndrome) (TP53) 


SEQ ID 
No: 442 







Tables 6A and 6B below relate to two
subpopulations of polynucleotide sequences of particular
interest for detecting hormone-sensitive tumors, allowing
ER+ and ER- samples to be distinguished.

Table 6A

overexpressed genes : top 5 



ER+ / ER-



Gene
symbol


SET 
No 


Name 


Seq3'


Seq5' 


Ref 


GATA3 


32 


gata-binding protein 3


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID 
No: 78 


KIAA1075 


136 


kiaa1075 protein


SEQ ID 
No: 322 


SEQ ID 
No:323 




MMP11 


145 


matrix metalloproteinase 11 
(stromelysin 3) 


SEQ ID 
No: 345 




SEQ ID 
No: 346 


MYB 


149 


v-myb avian myeloblastosis viral 
oncogene homolog 




SEQ ID 
NO: 354 


SEQ ID 
No:355


GZMA 


169 


granzyme a (granzyme 1,
cytotoxic t-lymphocyte-associated
serine esterase 3)


SEQ ID 
No:402 




SEQ ID 
No:403 



Table 6B 



underexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3'


Seq5'


Ref 


SOX4 


11 


sry (sex determining region y)-box 4


SEQ ID 
No: 22 


SEQ ID 
No:23 


SEQ ID 
No: 24 


IL2RB 


40 


interleukin 2 receptor, beta 


SEQ ID 
No: 97 


SEQ ID 
No: 98 


SEQ ID 
No:99 


EGFR 


57 


epidermal growth factor receptor 
(avian erythroblastic leukemia 
viral (v-erb-b) oncogene 
homolog) 


SEQ ID 
No: 135 


SEQ ID 
No: 136 


SEQ ID 
No: 137 


IL2RG 


119 


interleukin 2 receptor, gamma
(severe combined
immunodeficiency)


SEQ ID 
No: 279 


SEQ ID 
No:280 


SEQ ID 
No: 281 


CD3G 


174 


cd3g antigen, gamma polypeptide 
(tit3 complex) 


SEQ ID 
NO: 414 


SEQ ID
No:415


SEQ ID 
No: 416 



Table 7 below relates to subpopulations of
polynucleotide sequences of interest for distinguishing
tumors with lymph node involvement from tumors without
lymph node involvement.

TABLE 7



Gene 
symbol 


SET 
No


Name


Seq3'


Seq5' 


Ref 


EST T8998C 


8 


ests 


SEQ ID 
No:16






SOX4


11 


sry (sex determining region y)-box 4


SEQ ID
No:22


SEQ ID 
No: 23 


SEQ ID 
No: 24 


ENPP2 


18 


ectonucleotide
pyrophosphatase/phosphodiesterase 2
(autotaxin)


SEQ ID
No:39


SEQ ID
No:40


SEQ ID
No:41


MUC1 


25 


mucin 1, transmembrane 




SEQ ID 
No:57 


SEQ ID 
No: 58 


GATA3


32 


gata-binding protein 3 


SEQ ID 
NO: 76 


SEQ ID 
No: 77 


SEQ ID 
No: 78 


TOP2B 


34 


topoisomerase (dna) ii beta (180kd) 




SEQ ID 
No: 82 


SEQ ID 
No: 83 


IL2RB 


40 


interleukin 2 receptor, beta


SEQ ID 
NO: 97 


SEQ ID 
No: 98 


SEQ ID 
No: 99 


ERBB2 


49 


v-erb-b2 avian erythroblastic 
leukemia viral oncogene homolog 2 
(neuro/glioblastoma derived oncogene 
homolog) 




SEQ ID 
No:118 


SEQ ID 
No: 119 


EGFR 


57 


epidermal growth factor receptor 
(avian erythroblastic leukemia viral 
(v-erb-b) oncogene homolog) 


SEQ ID 
No:135


SEQ ID 
No:136 


SEQ ID 
No: 137 


THBS1 


91 


thrombospondin 1 


SEQ ID 
NO: 216 




SEQ ID 
No:217


PPP2R2C 


100 


protein phosphatase 2 (formerly 2a),
regulatory subunit b (pr 52), gamma
isoform


SEQ ID
No:238


SEQ ID
No:239




ATF3 


105 


activating transcription factor 3 


SEQ ID 
No:250


SEQ ID 
No:251 


SEQ ID 
NO:252 


KIAA1075 


136 


kiaa1075 protein


SEQ ID 
No: 322 


SEQ ID 
NO:323 




CDH1 


138 


cadherin 1, type 1, e-cadherin
(epithelial)


SEQ ID
No:326


SEQ ID 
No:327 


SEQ ID 
No: 328 


ZNF144 


139 


zinc finger protein 144 (mel-18)




SEQ ID
No:329


SEQ ID 
No:330 


GSTP1 


141 


glutathione s-transferase pi


SEQ ID 
NO:334 


SEQ ID 
No:335


SEQ ID 
No:336 


CD44 


158 


cd44 antigen (homing function and 
indian blood group system) 


SEQ ID 
No: 374 


SEQ ID 
No:375


SEQ ID 
No: 376 


GZMA 


169 


granzyme a (granzyme 1, cytotoxic t- 
lymphocyte-associated serine esterase 
3) 


SEQ ID 
No: 402 




SEQ ID 
No:403


EST T80406 


180 


Similar to SP:S36648 S36648 RB2/P130 
PROTEIN 


SEQ ID 
No: 430 






ESTs 
H30141 & 
H27466 


186 


Homo sapiens selectin P 


SEQ ID 
No:438 


SEQ ID 
No: 439 





Tables 7A and 7B below relate to two
subpopulations of polynucleotide sequences of particular
interest for distinguishing tumors with lymph node
involvement from tumors without lymph node involvement.

TABLE 7A



Overexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name


Seq3'


Seq5'


Ref 


ENPP2 


18 


ectonucleotide
pyrophosphatase/phosphodiesterase
2 (autotaxin)


SEQ ID 
No: 39 


SEQ ID 
No: 40 


SEQ ID 
NO: 41 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID
No:78


EGFR 


57 


epidermal growth factor 
receptor (avian erythroblastic 
leukemia viral (v-erb-b) 
oncogene homolog) 


SEQ ID 
No:135


SEQ ID 
No: 136 


SEQ ID 
No: 137 


THBS1 


91 


thrombospondin 1 


SEQ ID 
No: 216 




SEQ ID 
No:217 


ATF3 


105 


activating transcription factor 
3 


SEQ ID 
No: 250 


SEQ ID 
No:251 


SEQ ID 
No:252


TABLE 7B

Underexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3 ' 


Seq5' 


Ref 


SOX4


11


sry (sex determining region
y)-box 4


SEQ ID 
No: 22 


SEQ ID 
NO: 23 


SEQ ID 
No: 24 


IL2RB 


40 


interleukin 2 receptor, beta 


SEQ ID 
NO: 97 


SEQ ID 
No: 98 


SEQ ID 
No:99


ERBB2 


49 


v-erb-b2 avian erythroblastic 
leukemia viral oncogene 
homolog 2 (neuro/glioblastoma
derived oncogene homolog) 




SEQ ID 
No: 118 


SEQ ID 
No: 119 


PPP2R2C 


100 


protein phosphatase 2
(formerly 2a), regulatory
subunit b (pr 52), gamma
isoform


SEQ ID 
No:238 


SEQ ID 
No: 239 




GSTP1 


141 


glutathione s-transferase pi


SEQ ID 
No:334 


SEQ ID 
No:335 


SEQ ID 
No:336 



Tables 8, 8A and 8B below relate to
subpopulations of polynucleotide sequences of particular
interest for distinguishing anthracycline-sensitive tumors
from anthracycline-insensitive tumors.

TABLE 8



A1/A2



Gene 
symbol 


SET 
No 


Name 


Seq3'


Seq5 ' 


Ref 


SOX4 


11 


sry (sex determining region y)-box
4


SEQ ID 
No: 22 


SEQ ID 
No: 23 


SEQ ID 
No: 24 


CSF1 


22 


colony stimulating factor 1 
(macrophage) 


SEQ ID 
No: 48 


SEQ ID 
No: 49 


SEQ ID 
No: 50 


VIL2 


23 


villin 2 (ezrin) 


SEQ ID 
No: 51 


SEQ ID 
No: 52 


SEQ ID 
No: 53 


IGF2 


26 


insulin-like growth factor 2 
(somatomedin a) 


SEQ ID 
No: 59 


SEQ ID 
No: 60 


SEQ ID 
No: 61 


KIAA0427 


28 


kiaa0427 gene product 


SEQ ID 
No: 65 


SEQ ID 
No: 66 


SEQ ID 
No: 67 


MYC 


31 


v-myc avian myelocytomatosis viral 
oncogene homolog 


SEQ ID 
NO: 73 


SEQ ID 
No: 74 


SEQ ID 
No: 75 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID 
No: 78 


TOP2B 


34 


topoisomerase (dna) ii beta (180kd) 




SEQ ID 
No: 82 


SEQ ID 
No: 83 


ERBB2 


49 


v-erb-b2 avian erythroblastic 
leukemia viral oncogene homolog 2 
(neuro/glioblastoma derived 
oncogene homolog) 




SEQ ID 
No: 118 


SEQ ID 
No: 119 


EGFR 


57 


epidermal growth factor receptor 
(avian erythroblastic leukemia 
viral (v-erb-b) oncogene homolog) 


SEQ ID 
No: 135 


SEQ ID 
No: 136 


SEQ ID 
No:137


CRABP2 


64 


cellular retinoic acid-binding
protein 2


SEQ ID 
No: 156 


SEQ ID 
No: 157 


SEQ ID 
No:158 


GZMB 


73 


granzyme b (granzyme 2, cytotoxic
t-lymphocyte-associated serine
esterase 1)


SEQ ID 
No:178 




SEQ ID 
No: 179 


IGKC 


77 


immunoglobulin kappa constant 


SEQ ID 
No:186 






ANG 


81 


angiogenin, ribonuclease, rnase a
family, 5 




SEQ ID 
No: 194 


SEQ ID 
No: 195 


EFNA1 


95 


ephrin-a1




SEQ ID 
No: 226 


SEQ ID 
No:227 


MYBL2 


131 


v-myb avian myeloblastosis viral 
oncogene homolog-like 2 


SEQ ID 
No:308


SEQ ID 
No:309 


SEQ ID 
No:310 


CDH1 


138 


cadherin 1, type 1, e-cadherin 
(epithelial) 


SEQ ID 
No:326 


SEQ ID 
No:327 


SEQ ID 
No:328 


MST1 


140 


macrophage stimulating 1 
(hepatocyte growth factor-like) 


SEQ ID 
No:331 


SEQ ID 
No: 332 


SEQ ID 
No:333 


MYB 


149 


v-myb avian myeloblastosis viral
oncogene homolog




SEQ ID
No:354


SEQ ID 
No:355 


XBP1 


162 


x-box binding protein 1 


SEQ ID 
No:385


SEQ ID
No:386


SEQ ID
No:387


SRF


164


serum response factor (c-fos serum
response element-binding
transcription factor) 


SEQ ID 
No: 391 


SEQ ID 
No: 392 


SEQ ID 
No: 393 


SOX9 


165 


sry (sex determining region y)-box
9 (campomelic dysplasia, autosomal 
sex-reversal) 


SEQ ID
No: 394 




SEQ ID 
No:395 


ESTs 
H21879 &
H21880 


183 


Homo sapiens plasminogen activator
(PLAT)


SEQ ID
No:433


SEQ ID 
No: 434 





Tables 8A and 8B below relate to two
subpopulations of polynucleotide sequences of particular
interest for distinguishing anthracycline-sensitive tumors
from anthracycline-insensitive tumors.

TABLE 8A



overexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3 ' 


Seq5'


Ref 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
NO: 77 


SEQ ID 
No: 78 


KIAA1075 


136 


kiaa1075 protein


SEQ ID
No:322


SEQ ID 
No: 323 




MMP11 


145 


matrix metalloproteinase 
11 (stromelysin 3) 


SEQ ID 
No: 345 




SEQ ID 
No:346 


MYB 


149 


v-myb avian 
myeloblastosis viral 
oncogene homolog 




SEQ ID 
No:354


SEQ ID 
No: 355 


GZMA 


169 


granzyme a (granzyme 1,
cytotoxic t-lymphocyte-associated
serine esterase 3)


SEQ ID 
No:402 




SEQ ID 
No: 403 



TABLE 8B



underexpressed genes : top 5 



Gene 
symbol 


SET 
No 


Name 


Seq3' 


Seq5' 


Ref 


SOX4


11


sry (sex determining
region y)-box 4


SEQ ID 
NO: 22 


SEQ ID 
No: 23 


SEQ ID 
No: 24 


IL2RB 


40 


interleukin 2 receptor, 
beta 


SEQ ID 
No: 97 


SEQ ID 
No: 98 


SEQ ID 
No: 99 


EGFR 


57 


epidermal growth factor 
receptor (avian 
erythroblastic leukemia 
viral (v-erb-b) oncogene 
homolog) 


SEQ ID 
No: 135 


SEQ ID 
No: 136 


SEQ ID 
No:137


IL2RG 


119 


interleukin 2 receptor, 
gamma (severe combined 
immunodeficiency) 


SEQ ID 
NO: 279 


SEQ ID 
No:280 


SEQ ID 
No:281 


CD3G 


174 


cd3g antigen, gamma 
polypeptide (tit3 complex) 


SEQ ID 
No: 414 


SEQ ID 
No:415


SEQ ID 
No: 416 



Tables 9, 9A and 9B below relate to
subpopulations of polynucleotide sequences of particular
interest for classifying primary breast tumors as having a
good or a poor prognosis.

TABLE 9



Gene
symbol


SET 
No 


Name 


Seq3'


Seq5 ' 


Ref 


CTSB 


14 


cathepsin b 




SEQ ID 
No: 30 


SEQ ID 
No: 31 


VIL2 


23 


villin 2 (ezrin) 


SEQ ID 
No: 51 


SEQ ID 
No: 52 


SEQ ID 
No: 53 


MUC1


25 


mucin 1, transmembrane 




SEQ ID 
No: 57 


SEQ ID 
No: 58 


EMR1 


27 


egf-like module 
containing, mucin-like, 
hormone receptor-like
sequence 1 


SEQ ID 
No: 62 


SEQ ID 
No: 63 


SEQ ID 
No: 64 


KIAA0427 


28 


kiaa0427 gene product 


SEQ ID 
No: 65 


SEQ ID 
No: 66 


SEQ ID 
No: 67 


GATA3 


32 


gata-binding protein 3 


SEQ ID 
No: 76 


SEQ ID 
No: 77 


SEQ ID 
No: 78 


PRLR 


39 


prolactin receptor 


SEQ ID 
No: 94 


SEQ ID 
No: 95 


SEQ ID 
No: 96 


GATA3 


41 


gata-binding protein 3 


SEQ ID 
No:100 


SEQ ID 
No: 101 


SEQ ID 
No: 78 


TC21 


44 


oncogene tc21 


SEQ ID 
No: 106 


SEQ ID 
No:107 


SEQ ID 
No: 108 


BCL2 


48 


b-cell cll/lymphoma 2 


SEQ ID 
No: 115 


SEQ ID 
No: 116 


SEQ ID 
No: 117 


GATA3 


51 


gata-binding protein 3 


SEQ ID 
No: 122 




SEQ ID 
No: 78 


CRABP2 


64 


cellular retinoic acid- 
binding protein 2 


SEQ ID 
No: 156 


SEQ ID 
No: 157 


SEQ ID 
No: 158 


ANG 


81 


angiogenin, ribonuclease,
rnase a family, 5 




SEQ ID 
No:194 


SEQ ID 
No: 195 


EGF 


83 


epidermal growth factor 
(beta-urogastrone) 


SEQ ID 
No: 199 




SEQ ID 
No: 200 


THBS1 


91 


thrombospondin 1 


SEQ ID 
No: 216 




SEQ ID 
No: 217 


EDNRA 


96 


endothelin receptor type a 


SEQ ID 
No:228 




SEQ ID 
No: 229 


SMARCA2 


99 


swi/snf related, matrix 
associated, actin 
dependent regulator of 
chromatin, subfamily a, 
member 2 


SEQ ID 
No:235 


SEQ ID 
No:236 


SEQ ID 
No:237


ABCB1 


108 


atp-binding cassette, sub-family
b (mdr/tap), member 1


SEQ ID 
No:257 




SEQ ID 
No:258 


EGF 


110 


epidermal growth factor 
(beta-urogastrone) 


SEQ ID 
No:262 




SEQ ID 
No:200


BIRC4 


116 


baculoviral iap
repeat-containing 4


SEQ ID 
No: 273 




SEQ ID 
No: 274 


DAP3


117


death associated protein 3


SEQ ID
No:275




SEQ ID
No:276


GNRH1 


118 


gonadotropin-releasing 
hormone 1 (leutinizing- 
releasing hormone) 




SEQ ID 
No:277 


SEQ ID 
NO:278 


DAP3


120 


death associated protein 3 


SEQ ID 
No:282


SEQ ID
No:283


SEQ ID
No:276


EST 
R97218 


126 


ests, highly similar to 
tvhume hepatocyte growth
factor receptor precursor 
[h. sapiens] 


SEQ ID 
No:296 


SEQ ID 
No:297 




BCL2 


142 


b-cell cll/lymphoma 2


SEQ ID 
No:337 


SEQ ID 
No: 338 


SEQ ID 
No: 117 


BS69 


144 


adenovirus 5 e1a binding
protein 


SEQ ID 
No: 342 


SEQ ID 
No:343 


SEQ ID 
No: 344 


MYB 


149 


v-myb avian myeloblastosis 
viral oncogene homolog 




SEQ ID 
No: 354 


SEQ ID 
No:355 


CTSB 


152 


cathepsin b 


SEQ ID 
No:361 




SEQ ID 
NO: 31 


MLANA 


153 


melan-a 


SEQ ID 
No:362 


SEQ ID 
No:363 


SEQ ID 
No:364 


APR-1 


154 


apr-1 protein 


SEQ ID 
No:365


SEQ ID
No:366


SEQ ID
No:367


TC21 


157 


oncogene tc21 


SEQ ID 
No:372


SEQ ID
No:373


SEQ ID
No:108


CDKN3 


159 


cyclin-dependent kinase 
inhibitor 3 (cdk2- 
associated dual 
specificity phosphatase) 


SEQ ID 
No:377


SEQ ID
No:378


SEQ ID
No:379


XBP1 


162 


x-box binding protein 1 


SEQ ID 
No: 385 


SEQ ID 
No:386 


SEQ ID 
No:387


CDH15 


166 


cadherin 15, m-cadherin 
(myotubule) 


SEQ ID 
No:396 


SEQ ID 
No:397 


SEQ ID 
No: 398 


BCL2 


167 


b-cell cll/lymphoma 2 


SEQ ID 
No:399 


SEQ ID 
No:400 


SEQ ID 
No:117 


EST 
W73386 


168 


ests 


SEQ ID 
No:401 






ILF1 


171 


interleukin enhancer 
binding factor 1 


SEQ ID 
No:406 


SEQ ID 
No:407 


SEQ ID 
No:408


ARHGDIA 


172 


rho gdp dissociation 
inhibitor (gdi) alpha 


SEQ ID 
No:409 


SEQ ID 
No:410 


SEQ ID 
No:411 


C4A 


173 


complement component 4a 


SEQ ID 
No:412




SEQ ID 
No: 413 


ESR1 


176 


estrogen receptor 1 


SEQ ID 
No:420


SEQ ID 
No: 421 


SEQ ID 
No:422 


PBX1 


177 


pre-b-cell leukemia
transcription factor 1 


SEQ ID 
No: 423 


SEQ ID 
No: 424 


SEQ ID 
NO:425 


GLI3 


178 


gli-kruppel family member
gli3 (greig
cephalopolysyndactyly
syndrome)


SEQ ID
No:426


SEQ ID
No:427


SEQ ID
No:428








ILF1 


179 


interleukin enhancer 
binding factor 1 


SEQ ID 
No:429 




SEQ ID 
No:408 


ESTs 
H24628 & 
H24592 


184 


Homo sapiens aminoacylase 
1 (ACY1)


SEQ ID 
No:435 


SEQ ID 
No:436 




EST 
H28056 


185 


Homo sapiens E74-like 
factor 1 (ets domain 
transcription factor) 
(ELF1) 


SEQ ID 
No:437 







TABLE 9A 



Gene 
symbol 


SET No Name


Seq3 ' 


Seq5 1 


Ref 


VIL2 


23 villin 2 (ezrin) 


SEQ ID 
No: 51 


SEQ ID 
No: 52 


SEQ ID 
NO: 53 


MUC1 


25 mucin 1, transmembrane 




SEQ ID 
No: 57 


SEQ ID 
No: 58 


GATA3 


32 gata-binding protein 3 


SEQ ID
No:76


SEQ ID
No:77


SEQ ID
No:78


GATA3 


41 gata-binding protein 3 


SEQ ID 
No: 100 


SEQ ID 
No: 101 


SEQ ID 
No: 78 


BCL2 


48 b-cell cll/lymphoma 2 


SEQ ID 
No:115 


SEQ ID 
No: 116 


SEQ ID 
No: 117 


GATA3 


51 gata-binding protein 3 


SEQ ID 
NO:122 




SEQ ID 
No: 78 


CRABP2


64 cellular retinoic acid-binding
protein 2


SEQ ID 
No:156 


SEQ ID 
No: 157 


SEQ ID 
No:158 


ANG 


81 angiogenin, ribonuclease,
rnase a family, 5




SEQ ID 
No: 194 


SEQ ID 
No: 195 


EGF 


83 epidermal growth factor
(beta-urogastrone)


SEQ ID 
No:199 




SEQ ID 
No:200


THBS1 


91 thrombospondin 1 


SEQ ID 
No:216 




SEQ ID 
No: 217 


SMARCA2 


99 swi/snf related, matrix
associated, actin dependent 
regulator of chromatin, 
subfamily a, member 2 


SEQ ID 
No:235 


SEQ ID 
No: 236 


SEQ ID 
No: 237 


EGF 


110 epidermal growth factor
(beta-urogastrone)


SEQ ID 
No: 262 




SEQ ID 
No:200 


BIRC4 


116 baculoviral iap
repeat-containing 4


SEQ ID 
No: 273 




SEQ ID 
No: 274 


BCL2 


142 b-cell cll/lymphoma 2 


SEQ ID 
No:337 


SEQ ID 
NO:338 


SEQ ID 
No:117


BS69 


144 adenovirus 5 e1a binding
protein 


SEQ ID 
No: 342 


SEQ ID 
NO:343 


SEQ ID 
No: 344 


MYB 


149 v-myb avian myeloblastosis
viral oncogene homolog 




SEQ ID 
No:354 


SEQ ID 
No: 355 


XBP1 


162 x-box binding protein 1 


SEQ ID 
No:385 


SEQ ID 
No:386 


SEQ ID 
No:387 


BCL2 


167 b-cell cll/lymphoma 2 


SEQ ID 
No:399


SEQ ID 
No:400 


SEQ ID 
No: 117 


ILF1 


171 interleukin enhancer binding 
factor 1 


SEQ ID 
NO:406 


SEQ ID 
No:407 


SEQ ID 
No:408 


ARHGDIA 


172 rho gdp dissociation inhibitor
(gdi) alpha 


SEQ ID 
No: 409 


SEQ ID 
No: 410 


SEQ ID 
No: 411 


C4A 


173 complement component 4a 


SEQ ID 
No: 412 




SEQ ID 
No:413 


ESR1 


176 estrogen receptor 1 


SEQ ID 
No:420


SEQ ID 
No: 421 


SEQ ID 
No: 422 


PBX1 


177 pre-b-cell leukemia
transcription factor 1


SEQ ID 
No:423 


SEQ ID 
No: 424 


SEQ ID 
No: 425 


GLI3 


178 gli-kruppel family member gli3
(greig cephalopolysyndactyly
syndrome)


SEQ ID 
No:426 


SEQ ID 
No:427 


SEQ ID 
No:428 


ILF1 


179 interleukin enhancer binding
factor 1 


SEQ ID 
No: 429 




SEQ ID 
No:408


ESTs 
H24628 & 
H24592 


184 Homo sapiens aminoacylase 1
(ACY1)


SEQ ID
No:435


SEQ ID 
No:436 




EST 
H28056 


185 Homo sapiens E74-like factor 1
(ets domain transcription
factor) (ELF1)


SEQ ID 
NO:437 







TABLE 9B 



Gene 
symbol 


SET No Name 


Seq3'


Seq5 ' 


Ref 


CTSB 


14 cathepsin b 




SEQ ID 
No: 30 


SEQ ID 
No: 31 


EMR1 


27 egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


SEQ ID 
No: 62 


SEQ ID 
NO: 63 


SEQ ID 
NO: 64 


KIAA0427 


28 kiaa0427 gene product 


SEQ ID 
No: 65 


SEQ ID 
No: 66 


SEQ ID 
No: 67 


PRLR 


39 prolactin receptor 


SEQ ID 
No: 94 


SEQ ID 
NO: 95 


SEQ ID 
No: 96 


TC21 


44 oncogene tc21 


SEQ ID 
No: 106 


SEQ ID 
No: 107 


SEQ ID 
No:108


EDNRA 


96 endothelin receptor type a 


SEQ ID 
No: 228 




SEQ ID 
No:229


ABCB1 


108 atp-binding cassette, sub-family
b (mdr/tap), member 1


SEQ ID 
No: 257 




SEQ ID 
No:258 


DAP3


117 death associated protein 3


SEQ ID
No:275




SEQ ID 
No: 276 


GNRH1 


118 gonadotropin-releasing
hormone 1 (leutinizing-releasing
hormone)




SEQ ID 
No: 277 


SEQ ID 
No:278


DAP3


120 death associated protein 3


SEQ ID 
No : 282 


SEQ ID 
No : 283 


SEQ ID 
No : 276 


EST 
R97218 


126 ests, highly similar to
tvhume hepatocyte growth
factor receptor precursor 
[h. sapiens] 


SEQ ID 
No:296 


SEQ ID 
No:297 




CTSB 


152 cathepsin b 


SEQ ID 
No:361 




SEQ ID 
NO:31 


MLANA 


153 melan-a 


SEQ ID 
No:362 


SEQ ID 
No: 363 


SEQ ID 
NO:364 


APR-1 


154 apr-1 protein 


SEQ ID 
No:365 


SEQ ID 
No: 366 


SEQ ID 
No: 367 


TC21 


157 oncogene tc21 


SEQ ID 
No:372 


SEQ ID 
No: 373 


SEQ ID 
No:108 


CDKN3 


159 cyclin-dependent kinase
inhibitor 3 (cdk2-associated
dual specificity phosphatase)


SEQ ID 
No:377 


SEQ ID 
No:378


SEQ ID 
No: 379 


CDH15 


166 cadherin 15, m-cadherin
(myotubule)


SEQ ID 
No:396 


SEQ ID 
No: 397 


SEQ ID 
No:398 


EST 
W73386 


168 ests 


SEQ ID 
No:401 







Overexpression of genes detected using at
least one polynucleotide sequence selected from those
included in each of the predefined polynucleotide sequence
sets indicated in Table 9A, combined with underexpression of
genes detected using at least one polynucleotide sequence
selected from those included in each of the predefined
polynucleotide sequence sets indicated in Table 9B, is
indicative of a good outcome.

Accordingly, a preferred DNA array according to the
invention comprises at least one polynucleotide sequence
selected from those included in each of the predefined
polynucleotide sequence sets indicated in Table 9A and at
least one polynucleotide sequence selected from those
included in each of the predefined polynucleotide sequence
sets indicated in Table 9B.

Such DNA arrays are particularly useful for
distinguishing patients at high risk (Bad Outcome) from
those with a good prognosis (Good Outcome).
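
The decision rule described above (overexpression of the Table 9A genes combined with underexpression of the Table 9B genes) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the method of the specification: the two gene lists are small subsets of Tables 9A and 9B chosen for brevity, the expression values are assumed to be log2 ratios relative to a reference sample, and the `threshold` cut-off and the `classify_outcome` helper are hypothetical names introduced here.

```python
from typing import Iterable, Mapping

# Illustrative subsets of the gene symbols listed in Tables 9A and 9B.
TABLE_9A = ("GATA3", "BCL2", "ESR1", "XBP1")
TABLE_9B = ("CTSB", "PRLR", "EDNRA", "CDKN3")


def classify_outcome(log_ratios: Mapping[str, float],
                     over: Iterable[str] = TABLE_9A,
                     under: Iterable[str] = TABLE_9B,
                     threshold: float = 1.0) -> str:
    """Return "good" when every gene in `over` is overexpressed and every
    gene in `under` is underexpressed; otherwise return "not good".

    `log_ratios` maps a gene symbol to an assumed log2(tumor / reference)
    value. A gene missing from `log_ratios` is treated as unchanged (0.0),
    so it never satisfies either criterion.
    """
    over_ok = all(log_ratios.get(g, 0.0) >= threshold for g in over)
    under_ok = all(log_ratios.get(g, 0.0) <= -threshold for g in under)
    return "good" if over_ok and under_ok else "not good"


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = {"GATA3": 2.1, "BCL2": 1.4, "ESR1": 1.8, "XBP1": 1.2,
              "CTSB": -1.5, "PRLR": -1.1, "EDNRA": -2.0, "CDKN3": -1.3}
    print(classify_outcome(sample))
```

Requiring every listed gene to meet the criterion is the strictest reading of the rule; a looser score-based variant would be an equally plausible assumption.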
(mRNA)
Listing


5 
z 

Q 

a 

LU 

CO 


SEQ ID No:252 


o 




SEQ ID No:328 


SEQ ID No:119 




SEQ ID No:330 


SEQ ID No:58 


SEQ ID No:376 


SEQ ID No:149 


SEQ ID No:161


b 
z 
g 

o 

LU 

co 


SEQ ID No:6 


SEQ ID No:325 


g 

o 

UJ 
00 


SEQ ID No:266 


SEQ ID No:130


SEQ ID No:67


z 

D 

a 

LU 
CO 


SEQ ID No:38 


SEQ ID No:31 


z 
q 
a 

LU 

CO 


Seq5 ' 
PCT 
Listing 


SEQ ID No:40 


SEQ ID No:251 


o 


D 

s 

W 


SEQ ID No:327 


SEQ ID No:118 


z 
g 

a 

UJ 
09 


SEQ ID No:329


SEQ ID No:57 


SEQ ID No:375 


z 
g 
a 

LU 
CO 


SEQ ID No:160 


z 

Q 

i 

CO 


SEQ ID No:5 


SEQ ID No:324 


SEQ ID No:176 


SEQ ID No:265 


P 
z 
g 
a 

UJ 

cn 


Z 
g 

a 

LU 
CO 


SEQ ID No:370 


SEQ ID No:37 


SEQ ID No:30


SEQ ID No:44


Seq3 1 
PCT 
Listing 


SEQ ID No:39 


SEQ ID No:250 


SEQ ID No:16


SEQ ID No:438 


z 
9 

a 

LU 

CO 


o 


SEQ ID No:238 


o 




SEQ ID No:374 


SEQ ID No:147 


SEQ ID No:159


SEQ ID No:27 


o 






o 


o 


SEQ ID No:65 




SEQ ID No:36 






















































Seq5' 
US PROV 
LISTING 


SEQ ID No : 30 


SEQ ID No : 32




SEQ ID No : 34 


SEQ ID No : 36




SEQ ID No : 39 


SEQ ID No : 41


SEQ ID No : 42


SEQ ID No : 44


SEQ ID No : 46 


SEQ ID No : 48


SEQ ID No : 50 


SEQ ID No : 51 


SEQ ID No : 52 


SEQ ID No : 53 


SEQ ID No : 54 


SEQ ID No : 55 


SEQ ID No : 57 


SEQ ID No : 58 


SEQ ID No : 60 


SEQ ID No : 61 


SEQ ID No : 62 


Seq3 ' 1 
US PROV 
LISTING 


SEQ ID No : 29 


SEQ ID No : 31 I 


SEQ ID No : 33 




SEQ ID No : 35 


SEQ ID No : 37 


SEQ ID No : 38 


SEQ ID No : 40 




SEQ ID No : 43 


SEQ ID No : 45 


SEQ ID No : 47 


SEQ ID No : 49 












SEQ ID No : 56 




SEQ ID No : 59 






Image 


120916 


183030 


110480 


182264 


214008 


147016 


179197 


220451 


125413 ' 


290007 


152802 


153350 


112500 


109569 


212366 


154244 


187547 


150361 


127507 


276727 


116781 


112622 


123871 


Mom 


sctonucleotide 

pyrophosphatase/phosphodiesterase 
7iantataxin\ CENPP21 (ex PDNP2) 


activating transcription factor 3 (ATF3) 


< 
z 


selectin P (granule membrane protein 

140kn antinan OD621 fSELPl 


cadherin 1 , E-cadherin (epithelial) (CDH1 ) 


v-erb-b2 avian erythroblastic leukemia viral 
oncogene homolog 2 (neuro/glioblastoma 
rlRrived oncoaene homoloa) (ERBB2) 


(PP2A BR gamma) 


zincfinger protein 144 (Mel-18) (ZNF144) 


mucin 1 , transmembrane (MUC1 ) 


CD44E (epithelial form) 


phospholipase A2, group IIA (platelets, 
synovial fluid) (PLA2G2A), nuclear gene 


activin A receptor type ll-like 1 (ACVRL1) 


AXL receptor tyrosine kinase (AXL) 


KU-alpha, partial cds (new gene symbol 
Tlk21 


u 
1 

< 

If 

In Q. 
it 

St 


endothelin receptor type B (EDNRB), 


diphtheria toxin receptor (heparin-binding 


insulin-like growth factor 1 receptor (IGF1 R) 


1 
1 


I 
1 

>. 

% 

o 

% 

c 

J> 

ro 

a c 

O CC 


fibroblast growth factor receptor 4 (FGFR4) 


EST T85683 cathepsin B (CTSB) 


EST R00569 IL2-inducible T-cell kinase 
(ITK) 


a 




c 




















o 


CO 






CO 






CO 


% 








Symbole 
gene 


PDNP2 


ATF3 i 


< 
z 


SELP 


CDH1 I 


ERBB2 


PP2A BR 


ZNF144 ! 


a 
=i 

5 






PLA2G2A 


ACVRL1 


AXL 


PKU-ALPHA 


ABCC5 


EDNRB 


DTR 


IGF1R 


KIAA0427 


CD69 


FGFR4 


ESTT85683 


EST R00569 
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(mRNA) 
Seq (mRNA), PCT Listing | Seq5', PCT Listing | Seq3', PCT Listing | Image | Name | Gene symbol
("-" = no entry; "[illegible]" = entry not recoverable)

SEQ ID No:313 | SEQ ID No:312 | SEQ ID No:311 | 208118 | transforming growth factor, beta receptor III (TGFBR3) | TGFBR3
[illegible] | SEQ ID No:131 | 0 | 151149 | insulin receptor (INSR) | INSR
NA | NA | NA | 110599 | MAP/microtubule affinity-regulating kinase 3 (MARK3) | MARK3
SEQ ID No:87 | SEQ ID No:86 | 0 | 131504 | tissue inhibitor of metalloproteinase 2 (TIMP2) | TIMP2
SEQ ID No:241 | 0 | SEQ ID No:240 | 180219 | EST R85557 thrombospondin 3 (THBS3) | EST R85557
[illegible] | SEQ ID No:277 | 0 | 192688 | gonadotropin-releasing hormone 1 (GNRH1) | GNRH1
[illegible] | SEQ ID No:14 | SEQ ID No:13 | 110387 | fibroblast growth factor receptor 2 (FGFR2) | FGFR2
0 | 0 | SEQ ID No:35 | 114879 | NFKB2 | NFKB2
SEQ ID No:53 | SEQ ID No:52 | SEQ ID No:51 | 124701 | [illegible] | VIL2
[illegible] | SEQ ID No:197 | [illegible] | 156979 | [illegible] | ENG
[illegible] | 0 | SEQ ID No:221 | 162004 | EphA2 (EPHA2) | EPHA2
SEQ ID No:360 | SEQ ID No:359 | [illegible] | 258584 | cAMP responsive element modulator (CREM) | CREM
SEQ ID No:300 | SEQ ID No:369 | SEQ ID No:368 | 270549 | ets variant gene 5 (ETV5) | [illegible]
[illegible] | SEQ ID No:380 | 0 | 298242 | EST N68536 MAX-interacting protein 1 (MXI1) | EST N68536
0 | - | SEQ ID No:114 | 146635 | EST R81126 lymphotoxin beta receptor (LTBR) | EST R81126
SEQ ID No:272 | 0 | SEQ ID No:271 | [illegible] | [...] (POU2F2) | POU2F2
SEQ ID No:295 | SEQ ID No:294 | SEQ ID No:293 | 198144 | Friend leukemia virus integration 1 (FLI1) | [illegible]
SEQ ID No:110 | SEQ ID No:109 | 0 | 144081 | tyrosine kinase with immunoglobulin and epidermal growth factor homology domains (TIE) | [illegible]
SEQ ID No:96 | [illegible] | [illegible] | 138788 | prolactin receptor (PRLR) | PRLR
[illegible] | SEQ ID No:18 | SEQ ID No:17 | 110481 | protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform (calcineurin A gamma) (PPP3CC) (ex PPP3CA) | PPP3CA
SEQ ID No:220 | SEQ ID No: [number missing] | SEQ ID No:218 | 161451 | protein tyrosine phosphatase, non-receptor type 2 (PTPN2) | PTPN2
SEQ ID No:103 | SEQ ID No: [number missing] | 0 | 139326 | placental growth factor, vascular endothelial growth factor-related protein (PGF) | PGF
SEQ ID No:390 | SEQ ID No: [number missing] | SEQ ID No:388 | 309943 | tumor necrosis factor, alpha-induced [...] | [illegible]

Seq5', US PROV Listing (entries in source order; row alignment not recoverable): [illegible]; [illegible]; [illegible]; SEQ ID No:70; SEQ ID No:72; SEQ ID No:75; SEQ ID No:77; SEQ ID No:80; SEQ ID No:82; SEQ ID No:84; SEQ ID No:86; SEQ ID No:88; SEQ ID No:90; SEQ ID No:91; SEQ ID No:93; SEQ ID No:95; SEQ ID No:97; SEQ ID No:98; then four entries reading "SEQ ID No" with the number missing.

Seq3', US PROV Listing (entries in source order; row alignment not recoverable): SEQ ID No:63; SEQ ID No:66; SEQ ID No:69; [illegible]; [illegible]; SEQ ID No:92; SEQ ID No:94; SEQ ID No:96; SEQ ID No:99; then eleven entries reading "SEQ ID No" with the number missing.
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SEQ ID No:179 


SEQ ID No:75; SEQ ID No:85; SEQ ID No:70; -; [illegible]; [illegible]; SEQ ID No:50; [illegible]; SEQ ID No:134; SEQ ID No:292; SEQ ID No:307; SEQ ID No:4; SEQ ID No:12; 0; SEQ ID No:72; SEQ ID No:215; then three entries reading "SEQ ID No" with the number missing.

Seq5', PCT Listing (entries in source order; "-" = no entry): -; [illegible]; -; [illegible]; SEQ ID No:418; [illegible]; -; SEQ ID No:74; 0; SEQ ID No:69; SEQ ID No:205; -; [illegible]; SEQ ID No:49; SEQ ID No:104; SEQ ID No:133; 0; SEQ ID No:306; SEQ ID No:3; SEQ ID No:11; SEQ ID No:8; SEQ ID No:71; SEQ ID No:214; SEQ ID No: [number missing]; SEQ ID No: [number missing]; [illegible]

Seq3', PCT Listing (entries in source order; "-" = no entry): -; SEQ ID No:347; -; SEQ ID No:308; [illegible]; SEQ ID No:296; [illegible]; SEQ ID No:73; SEQ ID No:84; SEQ ID No:68; SEQ ID No:204; SEQ ID No:1; 0; SEQ ID No:48; 0; 0; SEQ ID No:291; SEQ ID No:305; -; -; SEQ ID No:7; -; 0

Seq5', US PROV Listing (legible entries in source order): SEQ ID No:101; SEQ ID No:103; SEQ ID No:105; several illegible entries; then fifteen entries reading "SEQ ID No" with the number missing.

Seq3', US PROV Listing (legible entries in source order): SEQ ID No:100; SEQ ID No:102; SEQ ID No:104; SEQ ID No:106; SEQ ID No:107; SEQ ID No:108; [illegible]; SEQ ID No:118; SEQ ID No:122; SEQ ID No:123; SEQ ID No:127; then three entries reading "SEQ ID No" with the number missing.

Image (entries in source order): 236008; 153446; 207378; 66969; 200394; 154343; 129438; 131502; 128142; 158347; [illegible]; 153548; 124554; 141924; 151247; 196282; 205612; 109093; 110101; 109677; 129059; 160580

Name

protein 3 (TNFAIP3)

PHB (prohibitin)

LIM domain protein (RIL)

v-myb avian myeloblastosis viral oncogene homolog-like 2 (MYBL2)

v-rel avian reticuloendotheliosis viral oncogene homolog B (nuclear factor of
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CLAIMS 

1. A polynucleotide library useful in the
molecular characterization of a carcinoma, said library
comprising a pool of polynucleotide sequences or subsequences
thereof wherein said sequences or subsequences are either
underexpressed or overexpressed in tumor cells, further wherein
said sequences or subsequences correspond substantially to
any of the polynucleotide sequences set forth in any of SEQ
ID Nos: 1-468 or the complement thereof.

2. A polynucleotide library according to Claim 1 
wherein said polynucleotide sequences or subsequences thereof 
of said pool correspond to any combination of at least one 
polynucleotide selected among those included in any one of the
following predefined sets:

SET 1: (SEQ ID No:1; SEQ ID No:2); SET 2: (SEQ ID
No:3; SEQ ID No:4); SET 3: (SEQ ID No:5; SEQ ID No:6); SET 4: (SEQ
ID No:7; SEQ ID No:8); SET 5: (SEQ ID No:9; SEQ ID No:10); SET 6:
(SEQ ID No:11; SEQ ID No:12); SET 7: (SEQ ID No:13; SEQ ID
No:14; SEQ ID No:15); SET 8: (SEQ ID No:16); SET 9: (SEQ ID No:17;
SEQ ID No:18; SEQ ID No:19); SET 10: (SEQ ID No:20; SEQ ID No:21); 
SET 11: (SEQ ID No:22; SEQ ID No:23; SEQ ID No:24); SET 12: (SEQ 
ID No:25; SEQ ID No:26); SET 13: (SEQ ID No:27; SEQ ID No:28; SEQ 
ID No:29); SET 14: (SEQ ID No:30; SEQ ID No:31); SET 15: (SEQ ID 
No:32; SEQ ID No:33; SEQ ID No:34); SET 16: (SEQ ID No:35);
SET 17 : (SEQ ID No: 36; SEQ ID No: 37; SEQ ID No: 38) ; SET 18 : 
(SEQ ID No:39; SEQ ID No:40; SEQ ID NO:41) ; SET 19 : (SEQ ID 
No:42; SEQ ID No:43) ; SET 20 : (SEQ ID No:44; SEQ ID No:45) ; SET 
21 : (SEQ ID No:46; SEQ ID No: 47) ; SET 22 : (SEQ ID No:48; SEQ ID 
No:49; SEQ ID No:50) ; SET 23 : (SEQ ID No:51; SEQ ID No:52; SEQ 
ID No:53) ; SET 24: (SEQ ID No:54; SEQ ID No:55; SEQ ID No:56) ; 
SET 25: (SEQ ID No:57; SEQ ID No:58) ; SET 26: (SEQ ID No:59; SEQ 
ID No: 60; SEQ ID No: 61) ; SET 27: (SEQ ID No: 62; SEQ ID No: 63; SEQ 
ID No:64) ; SET 28: (SEQ ID No:65; SEQ ID No:66; SEQ ID No:67) ; 
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SET 29: (SEQ ID No: 68; SEQ ID No: 69; SEQ ID No: 70) ; SET 30: (SEQ 
ID NO: 71; SEQ ID No: 72) ; SET 31 : (SEQ ID No: 73; SEQ ID No: 74; 
SEQ ID No: 75) ; SET 32 : (SEQ ID No:76; SEQ ID No:77; SEQ ID 
No:78) ; SET 33 : (SEQ ID No:79; SEQ ID No:80; SEQ ID No:81) ; SET 
34: (SEQ ID No:82; SEQ ID No:83); SET 35: (SEQ ID No:84; SEQ ID
No: 85) ; SET 36: (SEQ ID No: 86; SEQ ID No: 87) ; SET 37: (SEQ ID 
No:88; SEQ ID No:89; SEQ ID No:90) ; SET 38: (SEQ ID No:91; SEQ ID 
No: 92; SEQ ID No: 93) ; SET 39: (SEQ ID No: 94; SEQ ID No: 95; SEQ ID 
No:96) ; SET 40: (SEQ ID NO:97; SEQ ID No:98; SEQ ID No:99) ; SET 
41: (SEQ ID No:100; SEQ ID No:101; SEQ ID No:78); SET 42: (SEQ ID
No:102; SEQ ID No:103); SET 43: (SEQ ID No:104; SEQ ID No:105);
SET 44: (SEQ ID No: 106; SEQ ID No: 107; SEQ ID No: 108) ; SET 45: 
(SEQ ID No: 109; SEQ ID No: 110) ; SET 46: (SEQ ID No: 111; SEQ ID 
No:112; SEQ ID No: 113) ; SET 47: (SEQ ID No:114) ; SET 48: (SEQ ID 
No:115; SEQ ID No:116; SEQ ID No:117); SET 49: (SEQ ID No:118;
SEQ ID NO:119) ; SET 50: (SEQ ID No:120; SEQ ID No:121) ; SET 51: 
(SEQ ID No: 122; SEQ ID NO:78) ; SET 52: (SEQ ID No: 123; SEQ ID 
No: 124; SEQ ID No : 125) ; SET 53: (SEQ ID No: 126; SEQ ID No : 127; 
SEQ ID NO:128) ; SET 54: (SEQ ID No:129; SEQ ID No:130) ; SET 55: 
(SEQ ID No:131; SEQ ID No:132); SET 56: (SEQ ID No:133; SEQ ID
NO:134) ; SET 57: (SEQ ID No:135; SEQ ID No : 136; SEQ ID No:137) ; 
SET 58: (SEQ ID No : 138; SEQ ID No : 139; SEQ ID No: 140) ; SET 59: 
(SEQ ID No: 141; SEQ ID No: 142; SEQ ID No: 143) ; SET 60: (SEQ ID 
No: 144; SEQ ID No: 145; SEQ ID No: 146) ; SET 61: (SEQ ID No : 147; 
SEQ ID No:148; SEQ ID No:149); SET 62: (SEQ ID No:150; SEQ ID
No: 151; SEQ ID No: 152) ; SET 63: (SEQ ID No: 153; SEQ ID No : 154; 
SEQ ID No: 155) ; SET 64: (SEQ ID No: 156; SEQ ID No: 157; SEQ ID 
No:158) ; SET 65: (SEQ ID No:159; SEQ ID No:160; SEQ ID No: 161) ; 
SET 66: (SEQ ID No:162; SEQ ID No:163) ; SET 67: (SEQ ID No:164; 
SEQ ID No:165); SET 68: (SEQ ID No:166; SEQ ID No:167; SEQ ID
No:152) ; SET 69: (SEQ ID No:168; SEQ ID No:169; SEQ ID No:170) ; 
SET 70: (SEQ ID NO:171; SEQ ID No:172) ; SET 71: (SEQ ID No:173; 
SEQ ID NO:174; SEQ ID No:175) ; SET 72: (SEQ ID NO:176; SEQ ID 
No:177) ; SET 73: (SEQ ID No:178; SEQ ID No:179) ; SET 74: (SEQ ID 
No:180; SEQ ID No:181; SEQ ID No:182); SET 75: (SEQ ID No:183;
SEQ ID No:184) ; SET 76: (SEQ ID No:185) ; SET 77: (SEQ ID No:186) 
; SET 78: (SEQ ID No: 187; SEQ ID No: 188) ; SET 79: (SEQ ID No : 189; 
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SEQ ID No:190; SEQ ID No:191) ; SET 80: (SEQ ID No:192; SEQ ID 
No: 193) ; SET 81: (SEQ ID No : 194; SEQ ID No: 195) ; SET 82: (SEQ ID 
No:196; SEQ ID No:197; SEQ ID No:198); SET 83: (SEQ ID No:199;
SEQ ID No: 200) ; SET 84: (SEQ ID No : 201; SEQ ID No: 202; SEQ ID 
No: 203) ; SET 85: (SEQ ID No : 204; SEQ ID No: 205) ; SET 86: (SEQ ID 
No:206; SEQ ID No:207) ; SET 87: (SEQ ID No:208; SEQ ID No:209) ; 
SET 88: (SEQ ID No: 210; SEQ ID No: 211) ; SET 89: (SEQ ID No : 212; 
SEQ ID No: 213) ; SET 90: (SEQ ID No : 214; SEQ ID No : 215) ; SET 91: 

(SEQ ID No:216; SEQ ID No:217) ; SET 92: (SEQ ID No:218; SEQ ID 
No:219; SEQ ID No:220) ; SET 93: (SEQ ID No:221; SEQ ID No:222) ; 
SET 94: (SEQ ID No: 223; SEQ ID No: 224; SEQ ID No: 225) ; SET 95: 

(SEQ ID No: 226; SEQ ID No: 227) ; SET 96: (SEQ ID No: 228; SEQ ID 
No:229) ; SET 97: (SEQ ID No:230; SEQ ID No:231; SEQ ID No:232) ; 
SET 98: (SEQ ID No: 233 ; SEQ ID No: 234) ; SET 99: (SEQ ID No: 235; 
SEQ ID No:236; SEQ ID No:237) ; SET 100: (SEQ ID No:238; SEQ ID 
No:239) ; SET 101: (SEQ ID NO:240; SEQ ID NO:241) ; SET 102: (SEQ 
ID No: 242; SEQ ID No: 243; SEQ ID No: 244) ; SET 103: (SEQ ID 
No:245; SEQ ID No:246; SEQ ID No:247) ; SET 104: (SEQ ID No:248; 
SEQ ID No: 249) ; SET 105: (SEQ ID No: 250; SEQ ID No: 251; SEQ ID 
No:252) ; SET 106: (SEQ ID No:253; SEQ ID No: 254) ; SET 107: (SEQ 
ID No: 255; SEQ ID No: 256) ; SET 108: (SEQ ID No : 257; SEQ ID 
No:258) ; SET 109: (SEQ ID No:259; SEQ ID No:260; SEQ ID No:261) ; 
SET 110: (SEQ ID No: 262; SEQ ID No : 200) ; SET 111: (SEQ ID No : 263; 
SEQ ID No: 264) ; SET 112: (SEQ ID No: 265; SEQ ID No: 266) ; SET 
113: (SEQ ID No:267; SEQ ID No:268) ; SET 114: (SEQ ID No:269; SEQ 
ID No:270); SET 115: (SEQ ID No:271; SEQ ID No:272); SET 116:
(SEQ ID No:273; SEQ ID No:274); SET 117: (SEQ ID No:275; SEQ ID
No:276) ; SET 118: (SEQ ID No:277; SEQ ID NO:278) ; SET 119: (SEQ 
ID No: 279; SEQ ID No: 280; SEQ ID No: 281) ; SET 120: (SEQ ID 
No:282 ; SEQ ID No:283; SEQ ID NO:276) ; SET 121: (SEQ ID No:284; 
SEQ ID No: 285) ; SET 122: (SEQ ID No : 286; SEQ ID No:287; SEQ ID 
No: 288) ; SET 123: (SEQ ID No: 289; SEQ ID No: 290) ; SET 124: (SEQ 
ID No: 291; SEQ ID No:292) ; SET 125: (SEQ ID No:293; SEQ ID 
No:294; SEQ ID No:295) ; SET 126: (SEQ ID No:296; SEQ ID No:297) ; 
SET 127: (SEQ ID No : 298; SEQ ID No : 299; SEQ ID No: 300) ; SET 128: 

(SEQ ID No:301; SEQ ID No:302; SEQ ID No:288) ; SET 129: (SEQ ID 
No:303 ; SEQ ID No:304) ; SET 130: (SEQ ID No:305; SEQ ID No:306; 
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SEQ ID No:307) ; SET 131: (SEQ ID No:308; SEQ ID NO:309; SEQ ID 
No:310) ; SET 132: (SEQ ID No:311; SEQ ID No:312; SEQ ID No:313) ; 
SET 133: (SEQ ID No:314; SEQ ID NO:315; SEQ ID No:316) ; SET 134: 
(SEQ ID No: 317; SEQ ID NO:318) ; SET 135: (SEQ ID No: 319; SEQ ID 
No: 320; SEQ ID No: 321) ; SET 136: (SEQ ID No: 322; SEQ ID No: 323) ; 
SET 137: (SEQ ID No:324; SEQ ID No:325); SET 138: (SEQ ID No:326;
SEQ ID No:327; SEQ ID No:328); SET 139: (SEQ ID No:329; SEQ ID
No:330) ; SET 140: (SEQ ID No:331; SEQ ID No:332; SEQ ID No:333) ; 
SET 141: (SEQ ID No:334; SEQ ID No:335; SEQ ID No:336); SET 142:
(SEQ ID No:337; SEQ ID No:338; SEQ ID No:117) ; SET 143: (SEQ ID 
No:339; SEQ ID No:340; SEQ ID NO:341) ; SET 144: (SEQ ID No:342; 
SEQ ID No: 343; SEQ ID No: 344) ; SET 145: (SEQ ID No: 345; SEQ ID 
No:346) ; SET 146: (SEQ ID No:347; SEQ ID No:348; SEQ ID No:349) ; 
SET 147: (SEQ ID No:350; SEQ ID No:351) ; SET 148: (SEQ ID No:352; 
SEQ ID NO: 353) ; SET 149: (SEQ ID No: 354; SEQ ID No: 355) ; SET 
150: (SEQ ID No:356; SEQ ID No:357) ; SET 151: (SEQ ID No:358; SEQ 
ID No:359; SEQ ID No:360) ; SET 152: (SEQ ID No:361; SEQ ID NO:31) 
; SET 153: (SEQ ID No: 362; SEQ ID No : 363; SEQ ID No : 364) ; SET 
154: (SEQ ID No:365; SEQ ID NO:366; SEQ ID No:367) ; SET 155: (SEQ 
ID No:368; SEQ ID No:369; SEQ ID No:300) ; SET 156: (SEQ ID 
No:370; SEQ ID No:371) ; SET 157: (SEQ ID No:372; SEQ ID No:373; 
SEQ ID No: 108) ; SET 158: (SEQ ID No: 374; SEQ ID No: 375; SEQ ID 
No:376) ; SET 159: (SEQ ID No:377; SEQ ID No:378; SEQ ID No:379) ; 
SET 160: (SEQ ID No:380; SEQ ID No:381) ; SET 161: (SEQ ID No:382; 
SEQ ID No:383; SEQ ID No:384) ; SET 162: (SEQ ID No:385; SEQ ID 
No:386; SEQ ID No:387) ; SET 163: (SEQ ID No:388; SEQ ID No:389; 
SEQ ID No: 390) ; SET 164: (SEQ ID No: 391; SEQ ID No: 392; SEQ ID 
No:393) ; SET 165: (SEQ ID No:394; SEQ ID No:395) ; SET 166: (SEQ 
ID No:396; SEQ ID No:397; SEQ ID No:398) ; SET 167: (SEQ ID 
No:399; SEQ ID No:400; SEQ ID No: 117) ; SET 168: (SEQ ID No:401) ; 
SET 169: (SEQ ID No:402; SEQ ID No:403) ; SET 170: (SEQ ID No:404; 
SEQ ID No:405; SEQ ID No:318) ; SET 171: (SEQ ID No:406; SEQ ID 
No:407; SEQ ID No:408) ; SET 172: (SEQ ID No:409; SEQ ID No:410; 
SEQ ID No: 411) ; SET 173: (SEQ ID No: 412; SEQ ID No: 413) ; SET 
174: (SEQ ID No:414; SEQ ID No:415; SEQ ID No:416) ; SET 175: (SEQ 
ID No:417; SEQ ID No:418; SEQ ID No:419) ; SET 176: (SEQ ID 
No:420; SEQ ID No:421; SEQ ID No:422) ; SET 177: (SEQ ID No:423; 
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SEQ ID No:424; SEQ ID No:425) ; SET 178: (SEQ ID No:426; SEQ ID 
No:427; SEQ ID No:428); SET 179: (SEQ ID No:429; SEQ ID No:408);
SET 180: (SEQ ID No:430) ; SET 181: (SEQ ID No:431) ; SET 182: 
(SEQ ID NO:432) ; SET 183: (SEQ ID No:433; SEQ ID No:434) ; SET 
184: (SEQ ID No:435; SEQ ID No:436); SET 185: (SEQ ID No:437);
SET 186: (SEQ ID No:438; SEQ ID No:439) ; SET 187: (SEQ ID No:440; 
SEQ ID NO: 441) ; SET 188: (SEQ ID No: 442) ; SET 189: (SEQ ID 
No:444); SET 190: (SEQ ID No:445); SET 191: (SEQ ID No:446; SEQ
ID No:447) ; SET 192: (SEQ ID No:448) ; SET 193: (SEQ ID No:449) ; 

SET 194: (SEQ ID No:450); SET 195: (SEQ ID No:451); SET 196: (SEQ
ID No:452) ; SET 197: (SEQ ID NO:453) ; SET 198: (SEQ ID No:454) ; 
SET 199: (SEQ ID No:455) ; SET 200: (SEQ ID No:456) ; SET 201: 
(SEQ ID No:457) ; SET 202: (SEQ ID No:458) ; SET 203: (SEQ ID 
NO:459) ; SET 204: (SEQ ID NO:460) ; SET 205: (SEQ ID No:461) ; 

SET 206: (SEQ ID No:462); SET 207: (SEQ ID No:463); SET 208:
(SEQ ID No: 464) ; SET 209: (SEQ ID No: 465) ; SET 210: (SEQ ID 
No:466); SET 211: (SEQ ID No:467); SET 212: (SEQ ID No:468).

3. A polynucleotide library according to Claim 
2 wherein said polynucleotide sequences or subsequences
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 
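
The coverage thresholds recited in Claim 3 can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not part of the claimed subject matter: the names `PREDEFINED_SETS`, `set_coverage` and `coverage_tier` are hypothetical, and only the first four sets of Claim 2 are transcribed.

```python
from typing import Dict, Iterable, Set

# Hypothetical excerpt: SET number -> SEQ ID numbers, copied from SET 1 to SET 4 of Claim 2.
PREDEFINED_SETS: Dict[int, Set[int]] = {
    1: {1, 2},
    2: {3, 4},
    3: {5, 6},
    4: {7, 8},
}


def set_coverage(library: Iterable[int], sets: Dict[int, Set[int]]) -> float:
    """Fraction of predefined sets that have at least one SEQ ID in the library."""
    members = set(library)
    if not sets:
        return 0.0
    covered = sum(1 for seq_ids in sets.values() if seq_ids & members)
    return covered / len(sets)


def coverage_tier(fraction: float) -> str:
    """Map a coverage fraction onto the 50% / 75% / 100% tiers named in the claim."""
    if fraction >= 1.0:
        return "100%"
    if fraction >= 0.75:
        return "75%"
    if fraction >= 0.5:
        return "50%"
    return "below 50%"
```

For example, a library holding SEQ ID Nos 1, 4 and 7 touches SET 1, SET 2 and SET 4 of this excerpt, so `set_coverage([1, 4, 7], PREDEFINED_SETS)` evaluates to three sets out of four.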

4. A library according to any one of Claims 1 or 2
wherein the pool of polynucleotide sequences or subsequences
corresponds substantially to any combination of at least one
polynucleotide sequence selected among those included in each
one of the predefined polynucleotide sequence sets comprising:

SET 1: (SEQ ID No:1; SEQ ID No:2); SET 4: (SEQ ID
No:7; SEQ ID No:8); SET 18: (SEQ ID No:39; SEQ ID No:40; SEQ
ID No:41); SET 21: (SEQ ID No:46; SEQ ID No:47); SET 24: (SEQ
ID No:54; SEQ ID No:55; SEQ ID No:56); SET 32: (SEQ ID No:76;
SEQ ID No:77; SEQ ID No:78); SET 38: (SEQ ID No:91; SEQ ID
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No: 92 ; SEQ ID NO:93) ; SET 48: (SEQ ID No: 115 ; SEQ ID No : 116 ; 
SEQ ID No:117); SET 53: (SEQ ID No:126; SEQ ID No:127; SEQ ID
No:128) ; SET 58: (SEQ ID No:138 ; SEQ ID No: 139 ; SEQ ID No: 140) 
; SET 59: (SEQ ID No: 141 ; SEQ ID No: 142 ; SEQ ID No: 143) ; SET 
61: (SEQ ID No:147 ; SEQ ID No:148 ; SEQ ID No:149) ; SET 64: (SEQ 
ID No: 156 ; SEQ ID No: 157 ; SEQ ID No: 158) ; SET 66: (SEQ ID 
No: 162 ; SEQ ID No: 163) ; SET 69: (SEQ ID No: 168 ; SEQ ID No : 169; 
SEQ ID No:170); SET 73: (SEQ ID No:178; SEQ ID No:179); SET 85:
(SEQ ID No:204; SEQ ID NO:205) ; SET 88: (SEQ ID No:210; SEQ ID 
No: 211) ; SET 91: (SEQ ID No: 216; SEQ ID No: 217) ; SET 97: (SEQ ID 
No:230; SEQ ID No:231; SEQ ID No:232); SET 104: (SEQ ID No:248;
SEQ ID No: 249) ; SET 105: (SEQ ID No: 250 ; SEQ ID No: 251 ; SEQ ID 
No:252) ; SET 112: (SEQ ID No:265 ; SEQ ID No:266) ; SET 113: (SEQ 
ID No:267; SEQ ID No:268); SET 115: (SEQ ID No:271; SEQ ID
No:272) ; SET 131: (SEQ ID No:308 ; SEQ ID No:309 ; SEQ ID No:310) 
; SET 132: (SEQ ID No:311 ; SEQ ID No:312 ; SEQ ID No:313) ; SET 
134: (SEQ ID No:317 ; SEQ ID No:318) ; SET 137: (SEQ ID No:324 ; 
SEQ ID No:325) ; SET 145: (SEQ ID No:345 ; SEQ ID No:346) ; SET 
147: (SEQ ID NO:350 ; SEQ ID No:351) ; SET 155: (SEQ ID No:368 ; 
SEQ ID No:369 ; SEQ ID No:300) ; SET 175: (SEQ ID No:417 ; SEQ ID 
No:418 ; SEQ ID No:419) ; SET 180: (SEQ ID No:430) ; SET 181: (SEQ 
ID No:431) ; SET 182: (SEQ ID No:432) ; SET 185: (SEQ ID No:437) ; 
SET 187: (SEQ ID No:440; SEQ ID No:441),

wherein said sequences are useful in 
differentiating a normal cell from a cancer cell. 

5. A polynucleotide library according to Claim
4 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

6. A polynucleotide library according to Claim 4 
wherein the pool of polynucleotide sequences or subsequences 
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correspond substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising: 

SET 32: (SEQ ID No:76; SEQ ID No:77; SEQ ID No:78);
SET 73: (SEQ ID No:178; SEQ ID No:179); SET 131: (SEQ ID
No:308; SEQ ID No:309; SEQ ID No:310); SET 145: (SEQ ID No:345;
SEQ ID No:346) and SET 181: (SEQ ID No:431)

and of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets comprising:

SET 38: (SEQ ID No:91; SEQ ID No:92; SEQ ID No:93);
SET 58: (SEQ ID No:138; SEQ ID No:139; SEQ ID No:140); SET 61:
(SEQ ID No:147; SEQ ID No:148; SEQ ID No:149); SET 69: (SEQ ID
No:168; SEQ ID No:169; SEQ ID No:170) and SET 182: (SEQ ID
No:432).
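
The requirement in Claim 6 of at least one sequence from each listed set can be checked mechanically. The sketch below is a hypothetical illustration only: the set contents are transcribed from Claim 6, while the labels `GROUP_A` and `GROUP_B` and the helper `missing_sets` are assumptions introduced here and do not appear in the source.

```python
from typing import Dict, List, Set

# SET number -> SEQ ID numbers, transcribed from the first list in Claim 6.
GROUP_A: Dict[int, Set[int]] = {
    32: {76, 77, 78},
    73: {178, 179},
    131: {308, 309, 310},
    145: {345, 346},
    181: {431},
}

# SET number -> SEQ ID numbers, transcribed from the second list in Claim 6.
GROUP_B: Dict[int, Set[int]] = {
    38: {91, 92, 93},
    58: {138, 139, 140},
    61: {147, 148, 149},
    69: {168, 169, 170},
    182: {432},
}


def missing_sets(library: Set[int], *groups: Dict[int, Set[int]]) -> List[int]:
    """Return the sorted SET numbers that have no SEQ ID present in the library."""
    return sorted(
        set_no
        for group in groups
        for set_no, seq_ids in group.items()
        if not (seq_ids & library)
    )
```

An empty result from `missing_sets(library, GROUP_A, GROUP_B)` indicates that the library draws at least one sequence from every set in both lists; a non-empty result names the sets still uncovered.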

7. A polynucleotide library according to Claim
6 wherein said polynucleotide sequences or subsequences
thereof of said pool correspond to any combination of at
least one polynucleotide selected among those included in at
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

8. A library according to any one of Claims 1 or 2
wherein the pool of polynucleotide sequences or subsequences
corresponds substantially to any combination of at least one
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising: 

SET 11: (SEQ ID No:22; SEQ ID No:23; SEQ ID No:24);
SET 26: (SEQ ID No:59; SEQ ID No:60; SEQ ID No:61); SET 32:
(SEQ ID No:76; SEQ ID No:77; SEQ ID No:78); SET 34: (SEQ ID
No:82; SEQ ID No:83); SET 40: (SEQ ID No:97; SEQ ID No:98; SEQ
ID No:99); SET 57: (SEQ ID No:135; SEQ ID No:136; SEQ ID No:137);
SET 64: (SEQ ID No:156; SEQ ID No:157; SEQ ID No:158); SET
107: (SEQ ID No:255; SEQ ID No:256); SET 119: (SEQ ID No:279;
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SEQ ID No: 280 ; SEQ ID No : 281) ; SET 136: (SEQ ID No: 322 ; SEQ ID 
No:323); SET 140: (SEQ ID No:331; SEQ ID No:332; SEQ ID No:333)
; SET 141: (SEQ ID No:334; SEQ ID No:335 ; SEQ ID No:336) ; SET 
145: (SEQ ID No: 345; SEQ ID No: 346) ; SET 148: (SEQ ID No: 352; SEQ 
ID No:353); SET 149: (SEQ ID No:354; SEQ ID No:355); SET 162:
(SEQ ID No:385; SEQ ID No:386; SEQ ID No:387) ; SET 165: (SEQ ID 
No:394 ; SEQ ID No:395) ; SET 169: (SEQ ID No:402 ; SEQ ID No:403) 
; SET 174: (SEQ ID No : 414 ; SEQ ID No : 415 ; SEQ ID No:416) and SET 
188: (SEQ ID No:442) , 

wherein said sequences are useful in detecting a
hormone-sensitive tumor cell.

9. A polynucleotide library according to Claim 
8 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

10. A library according to Claim 8 wherein the 
pool of polynucleotide sequences or subsequences correspond 
substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising: 

SET 32: (SEQ ID No: 76 ; SEQ ID No: 77 ; SEQ ID No: 78) 
; SET 136: (SEQ ID No: 322 ; SEQ ID No: 323) ; SET 145: (SEQ ID 
No:345 ; SEQ ID No:346); SET 149: (SEQ ID NO:354 ; SEQ ID No:355) 
and SET 169: (SEQ ID No : 402 ; SEQ ID No : 403) 

and of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets comprising: 

SET 11: (SEQ ID No: 22 ; SEQ ID No: 23 ; SEQ ID No: 24) 
; SET 40: (SEQ ID No:97 ; SEQ ID No:98 ; SEQ ID No:99); SET 57: 
(SEQ ID No:135 ; SEQ ID No:136 ; SEQ ID No:137); SET 119: (SEQ ID 
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No:279; SEQ ID NO:280 ; SEQ ID No:281) and SET 174: (SEQ ID No:414 
; SEQ ID NO:415 ; SEQ ID No:416) 

11. A polynucleotide library according to Claim 
10 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

12. A library according to any one of Claims 1 or 2
wherein the pool of polynucleotide sequences or subsequences
corresponds substantially to any combination of at least one
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising: 

SET 8: (SEQ ID No: 16) ; SET 11: (SEQ ID No: 22 ; SEQ 
ID No: 23 ; SEQ ID No: 24) ; SET 18: (SEQ ID No: 39 ; SEQ ID No: 40 ; 
SEQ ID No:41); SET 25: (SEQ ID No:57; SEQ ID No:58); SET 32:
(SEQ ID No: 76 ; SEQ ID No: 77 ; SEQ ID No: 78) ; SET 34: (SEQ ID 
No: 82 ; SEQ ID No:83) ; SET 40: (SEQ ID No : 97 ; SEQ ID No:98 ; SEQ 
ID No: 99) ; SET 49: (SEQ ID No : 118 ; SEQ ID No: 119) ; SET 57: (SEQ 
ID No: 135 ; SEQ ID No : 136 ; SEQ ID No : 137) ; SET 91: (SEQ ID 
No:216 ; SEQ ID No:217) ; SET 100: (SEQ ID No:238 ; SEQ ID No:239) 
; SET 105: (SEQ ID No:250; SEQ ID No:251; SEQ ID No:252); SET
136: (SEQ ID No:322 ; SEQ ID No:323) ; SET 138: (SEQ ID No:326 ; 
SEQ ID No:327 ; SEQ ID No:328) ; SET 139: (SEQ ID NO:329 ; SEQ ID 
NO:330) ; SET 141: (SEQ ID No:334 ; SEQ ID No:335 ; SEQ ID NO:336) 
; SET 158: (SEQ ID NO:374 ; SEQ ID No:375 ; SEQ ID No:376) ; SET 
169: (SEQ ID No:402 ; SEQ ID No:403) ; SET 180: (SEQ ID NO:430) 
and SET 186: (SEQ ID No:438 ; SEQ ID No:439), 

wherein said sequences are useful in 
differentiating a tumor with lymph nodes from a tumor without 
lymph nodes. 
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13. A polynucleotide library according to Claim 
12 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

14. A library according to Claim 12 wherein the
pool of polynucleotide sequences or subsequences corresponds
substantially to any combination of at least one
polynucleotide sequence selected among those included in each
one of the predefined polynucleotide sequence sets comprising:

SET 18: (SEQ ID No:39 ; SEQ ID No:40 ; SEQ ID No:41) 
; SET 32: (SEQ ID No: 76 ; SEQ ID No: 77 ; SEQ ID No: 78) ; SET 57: 
(SEQ ID No:135 ; SEQ ID No:136; SEQ ID No:137); SET 91: (SEQ ID 
No:216 ; SEQ ID No:217) and SET 105: (SEQ ID No:250 ; SEQ ID 
No:251 ; SEQ ID No:252) 

and of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets comprising: 

SET 11: (SEQ ID No: 22 ; SEQ ID No: 23; SEQ ID No: 24) ; 
SET 40: (SEQ ID No:97; SEQ ID No:98; SEQ ID No:99); SET 49:
(SEQ ID No: 118 ; SEQ ID No:119) ; SET 100: (SEQ ID No:238 ; SEQ ID 
No: 239) and SET 141: (SEQ ID No: 334; SEQ ID No: 335 ; SEQ ID 
NO:336) . 

15. A polynucleotide library according to Claim
14 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

16. A library according to any one of Claims 1 or
2 wherein the pool of polynucleotide sequences or
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subsequences correspond substantially to any combination of 
at least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets comprising: 

SET 11: (SEQ ID No: 22 ; SEQ ID NO:23 ; SEQ ID No: 24) 
; SET 22: (SEQ ID No:48 ; SEQ ID No:49 ; SEQ ID NO:50) ; SET 23: 
(SEQ ID No: 51 ; SEQ ID No : 52 ; SEQ ID No: 53) ; SET 26: (SEQ ID 
No: 59 ; SEQ ID No: 60 ; SEQ ID No: 61) ; SET 28: (SEQ ID No: 65 ; SEQ 
ID No:66 ; SEQ ID No:67) ; SET 31: (SEQ ID No:73 ; SEQ ID No:74 ; 
SEQ ID No: 75) ; SET 32: (SEQ ID No: 76 ; SEQ ID No: 77 ; SEQ ID 
No:78) ; SET 34: (SEQ ID No:82 ; SEQ ID No:83) ; SET 49: (SEQ ID 
No: 118 ; SEQ ID No: 119) ; SET 57: (SEQ ID No: 135 ; SEQ ID No : 136 ; 
SEQ ID No:137) ; SET 64: (SEQ ID NO:156 ; SEQ ID No:157 ; SEQ ID 
No: 158) ; SET 73: (SEQ ID No: 178; SEQ ID No : 179) ; SET 77: (SEQ ID 
No: 186) ; SET 81: (SEQ ID No: 194 ; SEQ ID No: 195) ; SET 95: (SEQ 
ID No:226 ; SEQ ID No:227) ; SET 131: (SEQ ID No:308 ; SEQ ID 
No:309 ; SEQ ID NO:310) ; SET 138: (SEQ ID No:326 ; SEQ ID No:327 
; SEQ ID No:328) ; SET 140: (SEQ ID No:331 ; SEQ ID No:332 ; SEQ 
ID No: 333) ; SET 149: (SEQ ID No: 354 ; SEQ ID No: 355) ; SET 162: 
(SEQ ID No:385; SEQ ID No:386; SEQ ID No:387); SET 164: (SEQ ID
No:391 ; SEQ ID No:392 ; SEQ ID No:393) ; SET 165: (SEQ ID No:394 
; SEQ ID No:395) and SET 183: (SEQ ID No:433 ; SEQ ID No:434), 

wherein said sequences are useful in 
differentiating anthracycline-sensitive tumors from
anthracycline-insensitive tumors.

17. A polynucleotide library according to Claim 
16 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

18. A library according to Claim 16 wherein the 
pool of polynucleotide sequences or subsequences correspond 
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substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising 

SET 32: (SEQ ID No:76; SEQ ID No:77; SEQ ID No:78);
SET 136: (SEQ ID No:322; SEQ ID No:323); SET 145: (SEQ ID
No:345; SEQ ID No:346); SET 149: (SEQ ID No:354; SEQ ID No:355);
SET 169: (SEQ ID No:402; SEQ ID No:403)

and of at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets comprising:

SET 11: (SEQ ID No:22; SEQ ID No:23; SEQ ID
No:24); SET 40: (SEQ ID No:97; SEQ ID No:98; SEQ ID No:99);
SET 57: (SEQ ID No:135; SEQ ID No:136; SEQ ID No:137); SET
119: (SEQ ID No:279; SEQ ID No:280; SEQ ID No:281); SET
174: (SEQ ID No:414; SEQ ID No:415; SEQ ID No:416).

19. A polynucleotide library according to Claim 
18 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

20. A library according to any one of Claims 1 or 
2 wherein the pool of polynucleotide sequences or 
subsequences corresponds substantially to any combination of 
at least one polynucleotide sequence selected among those 
included in each one of predefined polynucleotide sequences 
sets comprising 

SET No 14 (SEQ ID No:30; SEQ ID No:31) ; SET No 23 
(SEQ ID No:51; SEQ ID No:52; SEQ ID No:53) ; SET No 25 (SEQ ID 
No:57; SEQ ID No:58) ; SET No 27 (SEQ ID No:62; SEQ ID No:63; SEQ 
ID No:64) ; SET No 28 (SEQ ID No:65; SEQ ID No:66; SEQ ID No:67) ; 
SET No 32 (SEQ ID No:76; SEQ ID No:77; SEQ ID No:78) ; SET No 39 
(SEQ ID No:94; SEQ ID No:95; SEQ ID No:96) ; SET No 41 (SEQ ID 
No:100; SEQ ID No:101; SEQ ID No:78) ; SET No 44 (SEQ ID No:106; 
SEQ ID No: 107; SEQ ID No: 108) ; SET No 48 (SEQ ID No: 115; SEQ ID 
No: 116; SEQ ID No: 117) ; SET No 51 (SEQ ID No: 122; SEQ ID No: 78) ; 
SET No 64 (SEQ ID No:156; SEQ ID No:157; SEQ ID No:158) ; SET No 
81 (SEQ ID No: 194; SEQ ID No: 195) ; SET No 83 (SEQ ID No: 199; SEQ 
ID No: 200) ; SET No 91 (SEQ ID No: 216; SEQ ID No: 217) ; SET No 96 
(SEQ ID No: 228; SEQ ID No : 229) ; SET No 99 (SEQ ID No: 235; SEQ ID 
No:236; SEQ ID No:237) ; SET No 108 (SEQ ID No:257; SEQ ID No:258) 
; SET No 110 (SEQ ID No:262; SEQ ID No:200) ; SET No 116 (SEQ ID 
No: 273; SEQ ID No : 274) ; SET No 117 (SEQ ID No: 275; SEQ ID No: 276) 
; SET No 118 (SEQ ID No:277; SEQ ID No:278) ; SET No 120 (SEQ ID 
No: 282; SEQ ID No: 283; SEQ ID No: 276) ; SET No 126 (SEQ ID No: 296; 
SEQ ID No:297) ; SET No 142 (SEQ ID No:337; SEQ ID No:338; SEQ ID 
No: 117) ; SET No 144 (SEQ ID No: 342; SEQ ID No: 343; SEQ ID No: 344) 
; SET No 149 (SEQ ID No:354; SEQ ID No:355) ; SET No 152 (SEQ ID 
No: 361; SEQ ID No: 31) ; SET No 153 (SEQ ID No: 362; SEQ ID No : 363 ; 
SEQ ID No:364) ; SET No 154 (SEQ ID No:365; SEQ ID No:366; SEQ ID 
No:367) ; SET No 157 (SEQ ID No:372; SEQ ID NO:373; SEQ ID No:108) 
; SET No 159 (SEQ ID No: 377; SEQ ID No: 378; SEQ ID No : 379) ; SET 
No 162 (SEQ ID NO:385; SEQ ID No:386; SEQ ID No:387) ; SET No 166 
(SEQ ID No:396; SEQ ID No:397; SEQ ID No:398) ; SET No 167 (SEQ ID 
No:399; SEQ ID NO:400; SEQ ID No:117) ; SET No 168 (SEQ ID No:401) 
; SET No 171 (SEQ ID NO:406; SEQ ID No:407; SEQ ID No:408) ; SET 
No 172 (SEQ ID No:409; SEQ ID No:410; SEQ ID No:411) ; SET No 173 
(SEQ ID No:412; SEQ ID No:413) ; SET No 176 (SEQ ID No:420; SEQ ID 
No:421; SEQ ID NO:422) ; SET No 177 (SEQ ID No:423; SEQ ID No:424; 
SEQ ID No:425) ; SET No 178 (SEQ ID NO:426; SEQ ID No:427; SEQ ID 
No:428) ; SET No 179 (SEQ ID No:429; SEQ ID No:408) ; SET No 184 
(SEQ ID No:435; SEQ ID No:436) ; SET No 185 (SEQ ID No:437), 

wherein said sequences are useful in classifying 
good and poor prognosis primary breast tumors. 

21. A polynucleotide library according to Claim 
20 wherein said polynucleotide sequences or subsequences 
thereof of said pool correspond to any combination of at 
least one polynucleotide selected among those included in at 
least 50%, preferably 75% and more preferably 100% of the 
predefined sets. 

22. A library according to Claim 20 wherein the 
pool of polynucleotide sequences or subsequences corresponds 
substantially to any combination of at least one 
polynucleotide sequence selected among those included in each 
one of predefined polynucleotide sequences sets comprising 

SET N° 23 (SEQ ID No: 51 ; SEQ ID No: 52 ; SEQ ID 
No:53) ; SET N° 25 (SEQ ID No:57 ; SEQ ID No:58) ; SET N° 32 (SEQ 
ID No: 76 ; SEQ ID No: 77 ; SEQ ID No: 78) ; SET N° 41 (SEQ ID No: 100 
; SEQ ID No: 101 ; SEQ ID No: 78) ; SET N° 48 (SEQ ID No: 115 ; SEQ 
ID No: 116 ; SEQ ID No: 117) ; SET N° 51 (SEQ ID No: 122 ; SEQ ID 
No:78) ; SET N° 64 (SEQ ID NO:156 ; SEQ ID No:157 ; SEQ ID No:158) 
; SET N° 81 (SEQ ID No: 194 ; SEQ ID No: 195) ; SET N° 83 (SEQ ID 
No:199 ; SEQ ID No:200) ; SET N° 91 (SEQ ID No:216 ; SEQ ID 
No: 217) ; SET N° 99 (SEQ ID No: 235 ; SEQ ID No: 236 ; SEQ ID 
No:237) ; SET N° 110 (SEQ ID NO:262 ; SEQ ID No:200) ; SET N° 116 
(SEQ ID No:273 ; SEQ ID No:274) ; SET N° 142 (SEQ ID NO:337 ; SEQ 
ID No:338 ; SEQ ID No:117) ; SET N° 144 (SEQ ID No:342 ; SEQ ID 
No: 343 ; SEQ ID No: 344) ; SET N° 149 (SEQ ID No: 354 ; SEQ ID 
No:355) ; SET N° 162 (SEQ ID No:385 ; SEQ ID No:386 ; SEQ ID 
No:387) ; SET N° 167 (SEQ ID No:399 ; SEQ ID No:400 ; SEQ ID 
No:117) ; SET N° 171 (SEQ ID No:406 ; SEQ ID No:407 ; SEQ ID 
NO:408) ; SET N° 172 (SEQ ID No:409 ; SEQ ID No:410 ; SEQ ID 
No:411) ; SET N° 173 (SEQ ID NO:412 ; SEQ ID No:413) ; SET N° 176 
(SEQ ID NO: 420 ; SEQ ID No: 421 ; SEQ ID No: 422) ; SET N° 177 (SEQ 
ID No: 423 ; SEQ ID No: 424 ; SEQ ID No: 425) ; SET N° 178 (SEQ ID 
No:426 ; SEQ ID No:427 ; SEQ ID No:428) ; SET N° 179 (SEQ ID 
No:429 ; SEQ ID No:408) ; SET N° 184 (SEQ ID No:435 ; SEQ ID 
No:436) ; SET N° 185 (SEQ ID No:437), 

and at least one polynucleotide sequence 
selected among those included in each one of predefined 
polynucleotide sequences sets comprising: 

SET No 14 (SEQ ID No: 30 ; SEQ ID No: 31) ; SET No 27 
(SEQ ID No: 62 ; SEQ ID No: 63 ; SEQ ID No: 64) ; SET No 28 (SEQ ID 